1.  INTRODUCTION 


This  report  documents  the  Software  Partitioning  Schemes  for  the 
Advanced  Simulation  Computer  Systems  Study  performed  by  Teledyne  Brown 
Engineering  (TBE)  under  Contract  No.  F33615-78-C-0013  for  the  Air  Force 
Human  Resources  Laboratory  (AFHRL).  The  report  contains  five  sections. 
Section  1  introduces  the  study  objectives,  background,  approach,  and  re¬ 
sults.  Section  2  defines  the  software  partitioning  problem  environment, 
partitioning  goals,  and  alternative  approaches.  Section  3  presents  the 
technical  details  of  the  resultant  software  partitioning  algorithm 
developed  and  manually  demonstrated  under  this  contract.  Section  4 
addresses  implementation  considerations  and  recommends  a  schedule  of 
tasks  for  algorithm  automation  verification  and  validation.  Section  5 
concludes  with  a  brief  recapitulation  of  the  study  findings,  related 
work,  and  areas  of  further  study. 


1.1  OBJECTIVES 


The  overall  objective  for  this  study  was  to  design  software 
partitioning  techniques  that  can  be  used  by  the  Air  Force  to  partition  a 
large  flight  simulator  program  for  optimal  execution  on  alternative  mul¬ 
tiple  processor  configurations.  In  particular,  the  Air  Force  needs  a 
software  partitioning  algorithm  for  use  in  conceptualizing,  manipulat¬ 
ing,  and  evaluating  candidate  flight  trainer  computational  designs. 
Major  design  objectives  pursued  by  TBE  in  deriving  the  software  parti¬ 
tioning  algorithm  included  emphasis  on  potential  automated  steps,  manual 
feasibility  demonstration,  and  recommended  implementation  steps  for  its 
use  by  the  Air  Force. 


1.2  BACKGROUND 


It  has  been  evident  for  some  time  that  significant  increases  in 
computer  system  performance  may  be  realized  by  using  two  or  more  smaller 
processors  connected  in  parallel,  as  opposed  to  one  large  processor. 
This  concept  has  been  utilized  in  many  real-time  flight  simulators  where 
each  of  several  computers  performs  a  specific  task.  Future  trends  are 
toward  further  expansion  of  this  concept  to  include  not  only  tasks  that 
may  be  executed  in  parallel  but  also  tasks  that  must  execute  serially 
because  of  temporal  relationships.  This  causes  many  multiple  processor 
configurations  to  be  applicable  to  flight  training  simulators  and  com¬ 
plicates  the  problem  of  allocating  the  software  among  the  processors. 

Typically,  the  design  of  a  computer  system  is  an  iterative  pro¬ 
cedure.  Certain  portions  of  the  hardware  and  software  can  be  designed 
independently, but the remaining portions must be designed interactively.
With the rising cost of software, it has become more and more important
to know the effect of computer hardware design on the design of the
software as well as the effect of the software design on the selection
and interconnection of the hardware to develop the optimum design for the
computer system.

This  study  has  pursued  the  development  of  an  algorithm  that  will 
facilitate  the  partitioning  of  both  parallel  and  sequentially  dependent 
tasks  to  a  given  hardware  configuration.  The  algorithm  has  the  potential 
of  being  automated. 


1.3  APPROACH 

This  study  was  comprised  of  three  phases:  Phase  I  -  Literature 
Search,  Phase  II  -  Simulator  Analysis,  and  Phase  III  -  Algorithm  Design 
and  Demonstration.  This  three-phased  approach  provided  a  logical 
sequence  of  research  and  analysis  that  resulted  in  the  delineation  of  the 
partitioning  technique  presented  in  this  report. 

The  Phase  I  literature  search  focused  on  current  documentation 
in  two  major  technical  areas.  The  first  area  concerned  flight  training 
simulator  computational  subsystem  designs.  The  second  area  addressed 
software  partitioning  schemes  for  allocation  of  parallel  and  serial 
application  tasks  to  advanced  multiple  processor  configurations. 

The  Phase  II  effort  was  subdivided  into  two  parts.  The  first 
part  was  the  analysis  of  literature  collected  to  properly  identify  the 
software  partitioning  goals  with  respect  to  flight  training  simulator 
designs.  The  second  part  was  the  selection  and  expansion  of  the  specific 
approach  for  the  techniques  to  be  applied  in  the  algorithm  design  to 
achieve  the  design  goals.  Partitioning  approaches  considered  included 
manual  allocation  schemes,  real-time  dynamic  task  allocation  schemes, 
and  a  mathematical  goal  program  statement  of  the  allocation  problem.  The 
mathematical  goal  program  model  approach  was  selected  because  of  its 
potential  for  systematically  obtaining  optimal  partitions  and  related 
quantitative  measures  in  an  automated  mode,  which  are  responsive  to 
alternative  candidate  design  features.  The  features  and  measures  that 
can  be  modeled  are  described  in  Section  3  in  terms  of  the  mathematical 
model,  algorithm  design,  and  algorithm  feasibility  demonstration.  Model 
measures  include  task  sizing  and  timing;  processor  utilization;  memory 
storage,  retrieval,  and  sizing;  and  real-time  task  constraints  and 
relationships.

Some  problems  were  encountered  in  pursuing  the  Phase  III  design 
to  implement  the  mathematical  goal  program  model  when  allocating  a  large 
number  of  tasks  and  data  blocks  to  a  large  number  of  processors,  memo¬ 
ries,  and  peripherals  comprising  the  candidate  configuration.  It  became 
evident  that  a  heuristic  goal  program  algorithm  needed  to  be  designed 
that  interfaces  with  a  linear  program  optimizer  to  obtain  "good"  task 
partition allocations for large partitioning problems. TBE's Input/Output
Requirements Language (IORL) supplemented with flowcharts was used to
delineate the algorithm design and provide the steps for performing a
manual demonstration of the algorithm's feasibility.

1.4  RESULTS 

One  of  the  most  important  results  of  this  study  was  a  mathemati¬ 
cal  model  defining  partitioning  parameters  and  measurements.  From  these 
parameters,  a  set  of  guidelines  has  been  recommended  for  the  establish¬ 
ment  of  a  centralized  automated  flight  training  simulator  computational 
design  data  base  repository  for  the  Air  Force.  These  design  parameters 
address five major areas, including flight training simulator
computational interface requirements, baseline software task/data
descriptions (independent of hardware implementation), candidate hardware
configuration specification, a technology data base, and (most important)
design
evaluation  user  interface  data  options.  These  parameters  along  with  the 
partitioning  mathematical  model  provide  steps  for  the  implementation  of 
an  automated  partitioning  algorithm  for  real-time  simulators.  Detailed 
recommendations  for  algorithm  implementation  are  provided  in  Section  4. 

Section  5  expands  TBE's  findings,  including  related  aspects  of 
our  Advanced  Multiple  Processor  Configuration  study  contract  encompass¬ 
ing  areas  for  further  research  and  development.  In  the  multiple  proces¬ 
sor  area,  the  impact  of  heterogeneous  processor  configurations  and 
potential  reconfiguring  capabilities  is  currently  being  investigated.  A 
major  area  for  future  study  is  the  impact  of  higher  order  architectures 
on  partitioning  allocation. 


2.  SOFTWARE  PARTITIONING 


To  develop  the  software  partitioning  algorithm  design  goals,  TBE 
addressed  the  definition  from  both  general  system  software  design  and 
particular  flight  training  simulator  software  design  viewpoints.  This 
section  supplies  the  basic  definition  of  the  software  partitioning 
environment,  the  design  goals  selected  for  flight  training  simulator 
software  partitioning  features,  and  alternative  approaches  considered 
during  this  study. 


2.1  PARTITIONING  ENVIRONMENT 


To  fully  appreciate  the  software  partitioning  environment  and 
its  associated  steps,  one  must  first  examine  its  relationship  with  the 
system  life  cycle.  Then,  flight  training  simulator  system  life-cycle 
peculiarities  must  be  considered.  The  questions  posed  by  this  study  in 
both  these  areas  concerned  the  identification  of  the  software  applica¬ 
tion  task  features  that  are  peculiar  to  advanced  real-time  simulation 
computational  systems  and  that  influence  the  software  design  partition¬ 
ing  process.  The  system  and  flight  trainer  life  cycles  are  now  described 
for  the  general  system,  followed  by  a  description  of  the  flight  training 
simulator  software  partitioning  features.  Emphasis  was  placed  on  iden¬ 
tifying  software  features  that  characterize  an  optimal  partitioning 
scheme  and  that  account  for  alternative  candidate  configurations  and 
provide  partitions  that  meet  real-time  load  balance  constraints. 

2.1.1  System  Life  Cycle 

Figure  1  depicts  the  major  phases  of  a  system  development 
effort.  The  development  phases  that  directly  relate  to  or  influence 
software  partitioning  include  subsystem  interface  requirements,  sub¬ 
system  functional  specification,  and  subsystem  detailed  design.  In 
addition,  during  the  operational  maintenance  of  the  system,  any  changes 
that  are  deemed  necessary  (to  either  correct  for  a  design  deficiency  or 
oversight,  or  to  implement  an  expanded  capability)  imply  that  a  reparti¬ 
tioning  of  tasks  may  be  needed  to  accommodate  the  required  change.  This 
phasing  relationship  to  partitioning  holds  for  any  system,  whether  it  is 
an  aircraft,  computer  center,  air  defense  system,  ...,  or  a  flight  train¬ 
ing  simulator  system. 

For  purposes  of  this  study,  the  detailed  design  phase  was 
selected  as  the  major  area  where  software  partitioning  parameters  become 
known.  Prior  to  this  phase,  a  system  partitioning  is  generally  performed 
to  denote  the  major  subsystems  and  their  respective  interface  functions. 
After  the  detailed  design  phase,  actual  hardware  is  procured  from  which 
prototype  build  implementation  is  initiated.  Therefore,  the  detailed 
design  phase  has  the  greatest  influence  on  mapping  software  tasks  to 
hardware  and  vice  versa. 

Figure 1. The system life cycle addresses partitioning at subsystem,
function, and detailed design phases for new and/or modified system
development efforts.

The  design  of  a  multiple  computer  system  traditionally  has  begun 
with  the  hardware  selection.  Once  the  computer  system  has  been  selected, 
the  development  of  software  begins.  During  development  and  even  after 
the  system  is  installed  in  the  field,  there  are  various  modifications  to 
both  the  hardware  and  the  software.  Because  software  has  traditionally 
lagged  the  hardware  development  activities,  the  hardware  has  had  a 
direct  influence  on  software  partitioning.  As  the  details  of  the  soft¬ 
ware  tasks  become  known,  projected  hardware  resources  are  typically 
found  to  be  inadequate,  which  necessitates  the  acquisition  of  additional 
processors  and/or  memories  to  meet  system  interface  requirements.  A 
software  partitioning  algorithm  must  be  able  to  address  software  appli¬ 
cation  design  parameters,  which  are  independent  of  a  particular  hardware 
configuration,  to  permit  a  variety  of  design  tradeoffs  to  be  evaluated 
for  alternative  candidates  prior  to  the  exact  configuration  selection. 

Once  a  system  enters  the  operational  phase,  maintenance  becomes 
the  prime  cost  factor  (indeed,  maintenance  cost  is  the  largest  cost  of 
the  system  life  cycle).  Change  and  configuration  controls  are  necessary 
for  a  system  or  subsystem  of  any  significant  size.  As  technology 
advances,  new  software  and  hardware  architectures  may  need  to  be  imple¬ 
mented.  A  tradeoff  must  be  made  to  decide  whether  to  convert  or  totally 
redesign  existing  software.  A  software  partitioning  algorithm  should 
provide  useful  information  regarding  allocation  of  current  baseline 
software  design  tasks  to  the  new  or  modified  hardware  architecture.  As 
with  design  development,  software  partitioning  in  the  operational  main¬ 
tenance  phase  addresses  the  design  details  of  any  proposed  changes. 

The  key  factor  for  flexible  software  partitioning  (from  the  sys¬ 
tem  life-cycle  viewpoint)  is  the  ability  to  define  software  design 
attributes  in  terms  of  the  dependent  application  software  task/data  flow 
relationships.  The  software  attributes  should  remain  independent  of, 
but  be  mappable  to,  a  particular  processor  architecture.  The  prolifera¬ 
tion  of  requirements  languages  (RLs)  and  higher  order  languages  (HOLs) 
is  a  testament  to  this  emerging  philosophy  in  the  DOD  community.  The 
distinction  between  an  RL  and  an  HOL  is  that  RLs  are  not  currently 
automated  to  the  extent  of  target  machine  code  generators  for  the  RL.  An 
HOL  such  as  JOVIAL,  HAL-S,  or  PL-1  supports  interpretation,  data  manage¬ 
ment,  and  code  generation  from  machine-independent  HOL  source  code  to  an 
intermediate  level  language  that  can  then  be  specifically  translated  to 
any  one  of  the  languages  supported  by  different  target  machines.  Once 
the  tasks  have  been  defined  in  a  suitable  RL  and  HOL,  the  problem  still 
exists  as  to  how  they  can  best  be  partitioned  or  allocated  to  the  candidate 
architecture.  Once  allocated,  the  resulting  partition  should  be  evalu¬ 
ated  in  terms  of  predicted  performance  and  cost/risk  assessments  by  a 
software  partitioning  model.  Iterative  feedback  from  this  performance 
evaluation  model  can  then  be  used  to  perturb  the  partition  based  on 
performance  penalties  to  derive  a  well-balanced  software  execution 
sequence. 

In  addition  to  problems  associated  with  the  general  system  life-cycle 
environment,  the  simulation  training  system  environment  offers 
special  considerations  and  problems  with  respect  to  software  partition¬ 
ing.  Aircraft  systems  are  continuously  being  upgraded,  and  this  causes 
changes  to  training  requirements.  Manual  interfaces  change  when  new  or 
modified  weapons  systems,  embedded  onboard  computer  systems,  and  opera¬ 
tional  tactical  policy  changes  are  introduced.  These  problems  are 
really  no  different  from  problems  encountered  during  the  maintenance 
phase  of  the  actual  system.  The  key  issue  is  when  and  how  actual  system 
changes  are  received,  evaluated,  and  introduced  into  the  training 
requirements. 

Actual  system  test  and  performance  measurement  tools  can  and 
should  provide  useful  inputs  for  simulator  training  software  required  to 
support  the  new/modified  devices.  In  the  case  of  embedded  computer  sys¬ 
tems,  simulated  training  scenarios  could  provide  additional  reliability 
tests  of  the  actual  onboard  computer  systems  as  well  as  the  prime  goal  of 
training  personnel.  As  a  result  of  these  considerations,  the  partition¬ 
ing  algorithm  should  facilitate  modular  design  definition  input  changes 
and  permit  new  technology  configurations  to  be  introduced  as  needed  to 
support  a  given  evaluation.  This  should  also  include  the  ability  to  fix 
allocations  of  certain  functional  tasks,  such  as  a  set  of  onboard  com¬ 
puter  tasks,  while  permitting  others  to  be  allocated  by  the  partitioning 
algorithm. 

AFHRL  supplied  a  benchmark  problem  and  the  detailed  design  docu¬ 
ments  and  source  code  listings  from  the  Advanced  Simulator  for  Under¬ 
graduate  Pilot  Training  (ASUPT,  now  known  as  Advanced  Simulator  for 
Pilot  Training  (ASPT)).  These  documents  were  analyzed  to  obtain  esti¬ 
mates  on  the  complexity  and  sizing  of  flight  simulator  software  parti¬ 
tioning.  This  analysis  identified  50  major  application  (both  real-time 
and  support)  tasks  (some  of  which  would  be  duplicated  to  support  multiple 
training  stations,  instructor  consoles,  weapon  systems,  and  aircraft 
models).  The  results  of  this  analysis  were  presented  at  an  interim 
briefing. 

It  should  be  noted  that  a  task  is  related  to  the  application. 
Its  ultimate  operational  realization  may  be  software,  firmware,  hard¬ 
ware,  or  a  combination  of  these,  depending  on  the  selected  design  config¬ 
uration.  The  tasks  being  considered  for  the  partitioning  algorithm  are 
related  to  the  computational  subsystem  of  real-time  flight  training  sim¬ 
ulators. 

Further  analysis  revealed  that  the  trainer  computational  sub¬ 
system  is  really  comprised  of  a  set  of  smaller  functional  subsystems, 
such  as  simulator  facility  control,  visual  computational  support,  and 
simulated  aircraft  mathematical  models;  thus,  the  number  of  processors 
and number of tasks for which selected software functions are being
allocated is reduced to approximately 30 tasks to three processors using
a common, shared multiport memory. In summary, flight trainer
computational configurations have both a functional partitioning of
processors and a task partitioning within each functional processor group.


2.2  DESIGN  GOALS 

Software  partitioning  of  tasks  to  alternate  candidate  multiproc¬ 
essor  configurations  must  be  a  systematic  process  based  on  measurable 
evaluation  goals.  The  selected  design  goals  for  the  partitioning  algo¬ 
rithm  developed  are  as  follows: 

(a)  With  software  system  task  flow  inputs  given,  partition 
tasks  to  a  user-specified  multiprocessor  hardware  configu¬ 
ration  subject  to  input  constraints 

(b)  Identify  interdependencies  among  the  tasks  that  require 
communication  links 

(c)  Incorporate  dynamic  performance  evaluation  feedback  to 
determine  the  best  partition  to  preclude  system  deadlocks 
and  account  for  critical  path  task  precedence  orderings 

(d)  Provide a means of balancing the processing load as a function
of processor utilization, so that the load is evenly distributed
among the processors and no one processor is saturated while
others remain idle for appreciable periods of time

(e)  Provide  cross  reference  of  task(s)  assigned  to  each  proces¬ 
sor  and  processor(s)  assigned  to  each  task 

(f)  List  critical  constraints  when  a  valid  partition  is  not 
obtainable 

(g)  Provide  a  development  cost  estimate  as  a  function  of  task 
sizing  and  instruction  mix,  which  is  related  in  terms  of 
assigned  candidate  processor  language  compilers  and  debug 
tool  measures. 
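
Goal (e) amounts to reading one allocation table in two directions. The
following is a minimal Python sketch of that cross reference, assuming the
partition is available as (task, processor) pairs. The function name, the
input format, and the task and processor names are illustrative
assumptions and are not taken from the TBE algorithm design.

```python
from collections import defaultdict


def cross_reference(assignments):
    """Build both cross-reference tables from (task, processor) pairs.

    `assignments` is a hypothetical input format: an iterable of
    (task_name, processor_name) pairs. A task may appear more than once
    when it is duplicated on several processors.
    """
    tasks_by_processor = defaultdict(list)
    processors_by_task = defaultdict(list)
    for task, processor in assignments:
        tasks_by_processor[processor].append(task)
        processors_by_task[task].append(processor)
    # Sort keys and values so the report order is stable.
    by_processor = {p: sorted(ts) for p, ts in sorted(tasks_by_processor.items())}
    by_task = {t: sorted(ps) for t, ps in sorted(processors_by_task.items())}
    return by_processor, by_task


# Illustrative data only; these names do not come from the report.
example = [
    ("flight_dynamics", "P1"),
    ("visual_support", "P2"),
    ("instructor_console", "P3"),
    ("instructor_console", "P1"),
]
by_processor, by_task = cross_reference(example)
for processor, tasks in by_processor.items():
    print(processor, tasks)
for task, procs in by_task.items():
    print(task, procs)
```

Keeping both directions in one pass means the two listings cannot
disagree with each other.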

In  deriving  this  set  of  goals,  several  issues  have  been  dis¬ 
cussed  pertaining  to  the  evaluation  environment  in  which  the  partition¬ 
ing  algorithm  is  to  operate.  The  baseline  set  of  questions  was: 

(a)  At  what  point(s)  in  the  system  development  cycle  is  the 
algorithm  to  be  used? 

(b)  What  timeframe  and  computer  resources  are  anticipated  for 
candidate  evaluations? 


(c)  To  what  extent  will  the  system  requirements  be  formatted? 
In  what  format? 

(d)  To  what  extent  will  the  alternative  candidate  design  con¬ 
figuration  be  documented?  In  what  format? 

The  answers  to  these  questions  relate  directly  to  the  level  of 
software  partitioning  and  types  of  system  parameters  that  can  be 
modeled,  allocated,  and  measured.  In  summary,  there  are  no  definitive 
answers  to  these  questions  since  each  flight  trainer  evaluation  tends  to 
be  tailored  to  specific  needs.  This  does  not  mean  that  systematic 
methodologies  and  standards  do  not  exist,  but  they  do  differ  from  one 
project  to  another.  The  potential  use  of  an  automated  partitioning 
algorithm  will  require  systematic  collection  and  development  of  flight 
trainer  requirements,  software  specifications,  and  candidate  configura¬ 
tion  inputs.  This  contract  has  concentrated  on  the  definition  of  parti¬ 
tioning  algorithm  logic  in  terms  of  design  inputs  which  are  transformed 
via  technology  data  and  user  evaluation  options  to  assist  and  assess  the 
partitioning  of  tasks  for  a  given  candidate  configuration. 

2.3  ALTERNATIVE  APPROACHES 

Software partitioning to date has been primarily a manual process
based on experience gained in development of previous flight simulators.
The designer community continually evolves and improves partition
allocations  using  projected  resource  requirements  and  implementing  the 
partition  to  see  how  well  it  performs.  In  some  cases,  real-time  alloca¬ 
tion  is  determined  by  a  master  computer  using  a  predefined  assignment 
scheme  that  incorporates  certain  dynamic  application  considerations. 
These  schemes,  whether  manual  or  partially  preprogrammed  controlled,  are 
not  easily  automated,  since  they  generally  require  that  a  specific  sys¬ 
tem  allocation  be  implemented  for  a  given  configuration.  Manual  projec¬ 
tions  are  limited  to  a  few  alternatives  for  a  given  type  of  configura¬ 
tion,  but  they  must  be  redone  for  alternative  configurations. 

In  surveying  potential  automated  models  to  meet  the  design 
goals,  the  basic  problem  to  be  solved  is  one  of  distributing  the  software 
system  tasks  and  related  data  blocks  to  a  candidate  hardware  architec¬ 
ture  network  such  that  a  representative  stressing  simulation  load  is 
handled.  In  general,  this  type  of  problem  is  typical  of  mathematical 
programming  problems  addressed  in  an  operations  research  (OR)  environ¬ 
ment.  Within  this  field,  there  are  a  variety  of  algorithms.  The  follow¬ 
ing  are  some  of  the  more  familiar: 

1.  Transportation  problem  of  product  transport  from  production 
locations  to  warehouses  and  customer  distribution  centers  to 
meet  customer  demand  at  minimum  cost. 


2.  Traveling  salesman  optimal  route  determination  to  service 
customers 

3.  Knapsack  packing  of  items  required  for  a  camping  trip  to  be 
distributed  evenly  among  campers 

4.  Capital  budgeting  problem  of  choosing  among  independent 
investment  alternatives  to  maximize  return  subject  to  cur¬ 
rent  investment  fund  constraints 

5.  Machine  shop  production  scheduling  to  meet  product  demand 
deadlines  with  minimum  machine  restructure  between  jobs  and 
given  employee  mix. 

The  software  partitioning  problem  has  attributes  similar  to  each  of 
these. 

In  the  case  of  the  software  partitioning  problem,  a  descriptive 
statement  of  the  model  is  as  follows: 

1.  Find  a  partition  that  best  satisfies  alternative  evaluation 
priority  functions: 

a.  Balance  the  processing  load  among  the  processors 

b.  Balance  the  memory  storage  utilization 

c.  Minimize  development  costs. 

2.  Subject  to: 

a.  Real-time  task  resource  requirements 

b.  Predicted  performance  simulation  feedback. 
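
Priority 1a and constraint 2a of this statement can be illustrated with a
small evaluation routine. The Python sketch below is an illustration under
stated assumptions, not the goal program of Section 3.1: it assumes each
task runs on exactly one processor, measures load balance as the spread
between the highest and lowest processor utilization, and treats a
processor as violating its real-time requirement when its busy time
exceeds the evaluation interval. Memory balance, development cost, and
simulation feedback are omitted. All names and numbers are hypothetical.

```python
def evaluate_partition(assignment, exec_time, executions, processors, interval):
    """Return (utilization, imbalance, overloaded) for a static partition.

    assignment : dict task -> processor
    exec_time  : dict (task, processor) -> time per execution
    executions : dict task -> executions required in the interval
    processors : list of all processors in the candidate configuration
    interval   : length of the evaluation interval (same unit as exec_time)
    """
    busy = {p: 0.0 for p in processors}
    for task, proc in assignment.items():
        busy[proc] += executions[task] * exec_time[(task, proc)]
    utilization = {p: busy[p] / interval for p in processors}
    # Spread between most and least utilized processor (0 = perfectly even).
    imbalance = max(utilization.values()) - min(utilization.values())
    # Processors that cannot finish their work inside the interval.
    overloaded = [p for p in processors if utilization[p] > 1.0]
    return utilization, imbalance, overloaded


# Hypothetical three-processor example, times in milliseconds.
processors = ["P1", "P2", "P3"]
assignment = {"A": "P1", "B": "P1", "C": "P2"}
exec_time = {("A", "P1"): 5.0, ("B", "P1"): 10.0, ("C", "P2"): 15.0}
executions = {"A": 2, "B": 2, "C": 3}
utilization, imbalance, overloaded = evaluate_partition(
    assignment, exec_time, executions, processors, interval=50.0
)
print(utilization, imbalance, overloaded)
```

In this example P3 is idle while P2 is at 90 percent, which is the kind of
uneven load that design goal (d) is meant to flag.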

When  defining  a  software  task  partitioning  model,  a  number  of 
factors  must  be  considered.  The  model  can  very  quickly  get  out  of  hand 
in  terms  of  size  for  current  optimization  techniques.  Thus,  the  model 
design  developed  under  this  contract  restricted  itself  to  a  static  allo¬ 
cation  problem  that  is  mathematically  stated  as  a  linear  goal  program 
problem  in  Section  3.1.  It  is  static  in  that  it  is  a  generalization  of 
the  real-time  application  tasks  to  be  allocated  to  a  given  candidate 
configuration.  In  this  sense  it  is  not  a  dynamic  real-time  allocation 
algorithm.  The  static  model  is  very  useful  in  the  candidate  design 
evaluation  mode,  since  many  numbers  are  based  on  predicted  task  sizing 
and  timing  plus  anticipated  computation  iteration  frequencies  to  support 
given  training  loads.  The  static  model  permits  average  to  worst-case 
growth  analysis  in  a  systematically  controlled  evaluation  environment, 
which  provides  the  means  to  ensure  a  complete  design  description  has  been 
input  and  independently  provides  a  measure  of  processor  utilization, 
memory  utilization  and  predicted  software  development  cost. 


Even  in  the  static  model  environment,  optimization  data  base 
sizing  and  numerical  roundoff  problems  are  encountered  for  evaluation  of 
a  computational  system  involving  much  more  than  three  processors,  20 
tasks,  40  data  blocks,  and  four  memories.  Specific  sizing  is  addressed 
in  Section  3.2.  For  this  reason,  a  heuristic  model  has  been  designed.  A 
heuristic  model  is  a  means  of  limiting  computations  to  a  logical  sequence 
of  iterative  improvements  via  allocation  tradeoffs  until  a  certain 
objective  level  is  either  found  to  be  feasible  or  a  bottleneck  has  been 
isolated. 
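
The idea of iterating allocation tradeoffs until an objective level is
reached or a bottleneck is isolated can be sketched in a few lines of
Python. This is a simple greedy stand-in chosen for illustration, not the
heuristic goal program designed under this contract: it moves one task at
a time off the busiest processor, accepts only moves that strictly lower
the peak busy time, and stops when the peak meets a target or no such move
exists. The function name, the load table, and the target are assumptions.

```python
def rebalance(assignment, load, processors, target):
    """Greedy iterative improvement of a task-to-processor assignment.

    assignment : dict task -> processor (not modified; a copy is returned)
    load       : dict (task, processor) -> busy time the task needs there
    processors : list of processors
    target     : acceptable peak busy time per processor
    Returns (assignment, peak, status), status is "feasible" or "bottleneck".
    """
    assignment = dict(assignment)
    while True:
        busy = {p: 0 for p in processors}
        for task, proc in assignment.items():
            busy[proc] += load[(task, proc)]
        worst = max(processors, key=lambda p: busy[p])
        peak = busy[worst]
        if peak <= target:
            return assignment, peak, "feasible"
        best = None  # (new_peak, task, destination)
        for task, proc in assignment.items():
            if proc != worst:
                continue
            for dest in processors:
                if dest == worst:
                    continue
                others = max(
                    (busy[r] for r in processors if r not in (worst, dest)),
                    default=0,
                )
                new_peak = max(
                    busy[worst] - load[(task, worst)],
                    busy[dest] + load[(task, dest)],
                    others,
                )
                if new_peak < peak and (best is None or new_peak < best[0]):
                    best = (new_peak, task, dest)
        if best is None:
            # No single move lowers the peak: report the bottleneck.
            return assignment, peak, "bottleneck"
        assignment[best[1]] = best[2]


# Hypothetical data: four tasks that cost the same on either processor.
processors = ["P1", "P2"]
cost = {"A": 4, "B": 3, "C": 2, "D": 1}
load = {(t, p): c for t, c in cost.items() for p in processors}
start = {t: "P1" for t in cost}

balanced, peak, status = rebalance(start, load, processors, target=5)
stuck, stuck_peak, stuck_status = rebalance(start, load, processors, target=4)
print(balanced, peak, status)
print(stuck_peak, stuck_status)
```

Because every accepted move strictly lowers the peak, the loop always
terminates. A single-move search of this kind can stop short of the best
partition, which is one reason an optimizer is paired with the heuristic.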


This  section  has  discussed  partitioning  considerations.  The  re¬ 
sultant  algorithm  design  details  are  highlighted  in  Section  3.  Imple¬ 
mentation  considerations  are  given  in  Section  4.  Section  5  incorporates 
areas  for  further  research  with  respect  to  optimizer  techniques  and  data 
base  selection. 


3.  MODEL  DEVELOPMENT 


Software  partitioning  model  development  is  presented  from  three 
different technical viewpoints in this section, including the
mathematical definition, the detailed design highlights, and a feasibility
demonstration synopsis. The model is expressed in generic computational
system  terms  where  the  major  components  are  tasks,  data  blocks,  proces¬ 
sors,  and  memories  that  are  partitioned  to  service  an  external  baseline 
load  environment.  The  mathematical  model  definition  delineates  all  the 
parameters  and  the  basic  relationships  that  must  be  satisfied  for  a  valid 
partition.  It  also  provides  a  statement  of  objective  functions  that 
permits  optimization  of  the  partition  when  the  basic  relationships  are 
found  to  have  a  feasible  solution  (i.e.,  a  feasible  partition). 

The  algorithm  design  highlights  are  presented  here  in  terms  of 
the  systematic  procedural  step  features  with  cross-references  to 
detailed  appendices.  Appendix  A  provides  user  input  information.  Out¬ 
put  report  formats  are  provided  in  Appendix  B.  Appendix  C  contains  the 
feasibility  demonstration  that  emphasizes  the  user  environment  of  input 
formulation,  critical  intermediate  step  results,  and  final  output  summa¬ 
ries.  Detailed  computations  and  design  logic  are  enumerated  in 
Appendix  D. 


3.1  MATHEMATICAL  STATEMENT 


This  mathematical  statement  provides  mathematical  terminology 
and  definitions  for  alternate  evaluation  priorities  and  constraint  form¬ 
ulation  based  on  a  generic  statement  of  a  candidate  configuration  for 
which  a  set  of  software  tasks  are  to  be  partitioned.  Each  mathematical 
symbol  is  defined  when  first  introduced.  In  addition,  Appendix  C  con¬ 
tains  a  master  list  of  mathematical  symbols  and  related  design  defini¬ 
tions.  A  special  effort  has  been  made  to  use  a  unique  symbol  for  a  given 
entity. The statement combines symbol definitions with linear programming
and goal programming model formulation terminology. Although knowledge of
these modeling and solution techniques is
helpful,  it  is  not  essential  to  the  understanding  of  the  basic  expression 
of  the  software  partitioning  problem  model.1  The  solution  techniques 
with  respect  to  the  software  partitioning  model  are  considered  in  the 
design  highlights  of  Section  3.2.  The  model  is  now  stated. 

3.1.1  Mathematical  Terminology 

The  mathematical  model  formulation  permits  the  major  decision 
variables  to  be  enumerated  in  terms  of  a  baseline  software  load  for  a 
given real-time interval of length τ. In the case of the flight
trainer, τ might be chosen to represent the maximum time permissible for
a complete real-time cycle. The baseline load could represent a
stressing training mix of tasks and data relationships that must be
performed to support the given trainer facility exercise; for example, a
two-on-one, air-to-air, combat maneuvering situation may be selected.
For more detailed partitioning loads, τ could be selected to represent a
specific  segment  of  the  real-time  cycle  to  further  analyze  and  partition 
parallel  versus  dependent  task/data  flow  relationships. 

The major decision variables (outputs of the algorithm) with respect
to software partitioning allocation are defined as follows:

_tp    -  1, if task t is assigned to execute on processor p;
          0, otherwise

_tp    -  number of task t executions on processor p for the
          evaluation problem time period

_tp    -  development cost to implement task t on processor p as
          currently partitioned

_mb    -  1, if memory storage m contains block b;
          0, otherwise

_b     -  number of memories where block b is stored

_mpti  -  number of times input block i of task t is input for
          task t on processor p from memory m

_mpto  -  number of times output block o of task t is written or
          updated by task t on processor p to memory m.


These  outputs  are  determined  for  a  given  set  of  software  task  and  candi¬ 
date  architecture  inputs.  The  basic  algorithm  control  inputs  are 
denoted  by: 


T  -  number of tasks to be allocated to processors

P  -  number of processors

M  -  number of memories

B  -  number of distinct storage blocks to be allocated to memories
      (this includes instruction and data blocks)

Q  -  number of communication links

8  -  maximum number of input and/or output blocks per task.

The  values  of  these  parameters  control  the  overall  algorithm  sizing, 
timing,  and  looping  logic. 
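
A rough count shows how quickly these control inputs drive problem size.
The Python sketch below multiplies out the index ranges of the decision
variable families listed above (three families indexed by task and
processor, one by memory and block, one by block, and the input and output
transfer counts indexed by memory, processor, task, and block). It is an
illustrative tally, not the sizing analysis of Section 3.2. S is a name
chosen here for the maximum number of input and/or output blocks per task,
and the value S = 4 is an assumption.

```python
def variable_counts(T, P, M, B, S):
    """Rough number of decision variables per index family.

    T, P, M, B follow the control inputs in the text; S is a hypothetical
    name for the maximum number of input and/or output blocks per task.
    """
    counts = {
        "tp": 3 * T * P,        # assignment, execution count, development cost
        "mb": M * B,            # block-in-memory indicators
        "b": B,                 # number of memories holding each block
        "mpti": M * P * T * S,  # input transfer counts
        "mpto": M * P * T * S,  # output transfer counts
    }
    counts["total"] = sum(counts.values())
    return counts


# Three processors, 20 tasks, 40 data blocks, four memories; S assumed.
small = variable_counts(T=20, P=3, M=4, B=40, S=4)
# Doubling every configuration dimension while holding S fixed.
doubled = variable_counts(T=40, P=6, M=8, B=80, S=4)
print(small["total"], doubled["total"])
```

Under these assumptions the transfer-count families dominate, and doubling
the configuration raises the total by roughly a factor of seven.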

The  baseline  task  load  may  be  represented  as  configuration- 
independent,  processor-dependent,  and  memory-dependent  input  parame¬ 
ters.  The  configuration-independent  input  parameters  are  defined  as 
follows  for  each  task,  t: 


N_t            - number of times task t is to be executed during the
                 evaluation interval, τ, for which partitioning is being done

S_t            - maximum time limit per task t execution

I_t            - number of distinct input blocks for task t

I_{ti}         - global data block index for task t input block i

\Lambda_{ti}   - percent of information input for task t from block i

O_t            - number of distinct output data blocks for task t

O_{to}         - global data block index for task t output block o

\Omega_{to}    - percent of information output from task t to block o.

The  processor-dependent  task  inputs  are  defined  as  follows  for 
task  t  on  processor  p: 


c_{tp}        - time for task t execution on processor p

R_{tp}        - resource task management coefficient for task t on
                processor p if time or data enabled task (these tasks
                require periodic enablement or polling by the processor
                to which they are assigned)

r_{tp}        - resource task management per task t execution on
                processor p for slaved enabled task (these tasks are
                enabled by another task)

d_{tp}        - the cost coefficient for developing task t to run on
                processor p independent of allocation

\delta_{tp}   - the cost coefficient for resource management of task t
                development on processor p.

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Section  4.2  discusses  the  implementation  means  for  computing  these 
values  based  on  independent  task  descriptions,  processor  configuration, 
and  a  technology  data  base.  The  mathematical  model  assumes  that  these 
values  are  known. 


In  addition  to  the  task-to-processor  allocation  relationships, 
the  storage  allocation  of  blocks  to  memories  operates  on  a  similar  con¬ 
cept.  A  master  block  list  of  distinct  data  and/or  instruction  blocks  is 
independently  defined  and  then  mapped  via  the  candidate  configuration 
and  technology  memory  parameter  inputs  to  supply  the  following  parame¬ 
ters  with  regard  to  block  b,  memory  m,  processor  p,  and  communication 
link  q: 

-  length  in  bits  of  block  b  when  stored  in  memory  m 

L_m           - length of memory m in bits

a_{mp}        = 1, if access from memory m to processor p exists, i.e.,
                there is at least one access link q for m and p
              = 0, otherwise

\alpha_{mp}   - bits/second transfer rate from memory m to processor p
                based on statistical composite of access links for p and m

w_{mp}        = 1, if processor p is permitted to change contents of
                memory m, i.e., there is at least one write access link q
                from p to m
              = 0, otherwise

\omega_{mp}   - bits/second transfer rate from processor p to memory m
                based on statistical composite of write access links for p
                and m.


The  task  relationships  to  these  blocks  are  defined  as  part  of  the  real¬ 
time  constraints  in  Section  3.1.5. 

3.1.2  Processor  Utilization  and  Growth  Balance 

Given the mathematical terms defined in Section 3.1.1, the processor
utilization, U_p, associated with a partition may be expressed as
follows (for each processor p = 1 to P):

    U_p  =    [task computation and resource management time]
            + [task input processing]
            + [task output processing]
            + [task resource management]


An absolute constraint is that:

    U_p \le 1  for p = 1 to P.

In other words, a processor, p, cannot be more than 100% allocated.

The objective function for processor balance may be written:

    Minimize  \sum_{i=1}^{P-1} \sum_{j=i+1}^{P} | U_i - U_j |

              (minimize differences in processor loads)

It should be noted that the presence of absolute values implies a
nonlinear objective. The processor utilization balance can be mapped
(via a ranked ordering of the U_p into U'_1, ..., U'_P such that
U'_i \ge U'_{i+1}) to a linear objective for a given partition.
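
The mapping from the absolute-value objective to a linear one can be
illustrated with a short sketch. It assumes the utilizations are
already known for a given partition; the function names and the
closed-form weights (P + 1 - 2k for the k-th largest value) are
illustrative and are not taken from the study itself.

```python
from itertools import combinations


def pairwise_imbalance(utilization):
    """Sum of |U_i - U_j| over all processor pairs i < j."""
    return sum(abs(a - b) for a, b in combinations(utilization, 2))


def ranked_imbalance(utilization):
    """Same quantity written as a linear form over the ranked values."""
    ranked = sorted(utilization, reverse=True)  # U'_1 >= U'_2 >= ... >= U'_P
    count = len(ranked)
    # With the ranking fixed, the k-th largest value (k = 1..P) carries
    # the constant weight P + 1 - 2k, so no absolute values remain.
    return sum((count + 1 - 2 * k) * u for k, u in enumerate(ranked, start=1))
```

For three processors loaded at 0.70, 0.40, and 0.55, both functions
give the same total, which is the sense in which the ranked ordering
removes the absolute values for a fixed partition.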

This  objective  statement  assumes  that  perfect  balance  is  the 
ultimate  or  optimal  partition.  The  candidate  design  being  considered 
may  represent  only  a  portion  of  a  bigger  design  evaluation  problem.  In 
this  case,  the  use  of  certain  processors  may  be  favored,  whereas  others 
should  not  be  considered.  To  handle  this  more  realistic  partitioning 
situation,  each  processor  has  two  additional  parameters,  which  are  user- 
specified: 


L_p  - absolute upper limit for processor p's utilization

G_p  - goal or target limit for processor p's utilization.

With these additional parameters, the following constraints apply:

    G_p \le L_p     Goal must be less than or equal to the
                    absolute limit.

    U_p \le L_p     Each processor must be below its absolute
                    limit.

The objective for the optimal partition in terms of processor
utilization becomes:

    Minimize  \sum_{i=1}^{P-1} \sum_{j=i+1}^{P} | (U_i - G_i) - (U_j - G_j) |


This basically states that the processor utilization is in balance with
respect to user-specified goals. In the case of a flight trainer
software partitioning evaluation, G_p could reflect a percentage that
allows for future growth. Thus, G_p = 0.60 reflects a 40% growth
factor for processor p.
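
A minimal sketch of the goal-adjusted balance measure follows. It
assumes the utilizations, goals, and absolute limits are supplied as
parallel lists indexed by processor; the function name and the choice
to return None for an inadmissible input are illustrative only.

```python
from itertools import combinations


def goal_imbalance(utilization, goal, limit):
    """Sum of |(U_i - G_i) - (U_j - G_j)| over pairs, or None if a limit is violated."""
    for u, g, cap in zip(utilization, goal, limit):
        # The goal may not exceed the absolute limit, and no processor
        # may be loaded beyond its absolute limit.
        if g > cap or u > cap:
            return None
    deviation = [u - g for u, g in zip(utilization, goal)]
    return sum(abs(a - b) for a, b in combinations(deviation, 2))
```

With a common goal of 0.60 (a 40% growth reservation), processors
loaded at 0.50 and 0.30 differ from their goals by -0.10 and -0.30,
giving an imbalance of 0.20.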


The  algorithm  as  currently  designed  (Section  3.2)  assumes  that 
an  initial  feasible  solution  is  provided  by  the  candidate  design  and 
utilizes  a  heuristic  solution  based  on  the  absolute  difference  between 
the  most  heavily  loaded  processor  and  the  least  loaded,  taking  into 
account  the  goal  growth  reservation  to  distribute  the  process  load. 

3.1.3  Storage Utilization and Growth Balance

Storage utilization, u_m, may be expressed for each memory unit,
m = 1 to M, as the sum of the blocks stored divided by the total memory:

    u_m  =  ( \sum_{b=1}^{B} s_{mb} \cdot [length in bits of block b when stored in memory m] ) / L_m

As with the processor balance formulation, storage utilization cannot
exceed the capacity of the device:

    u_m \le 1  for m = 1 to M.


In addition, storage growth balance can be established with a
respective goal utilization, g_m, and an absolute limit, \ell_m, for each
memory as follows:

    Minimize  \sum_{i=1}^{M-1} \sum_{j=i+1}^{M} | (u_i - g_i) - (u_j - g_j) |

where

    u_m \le \ell_m  for m = 1 to M.



As  with  the  processor  utilization,  the  solution  technique 
defined  in  Section  3.2  for  storage  utilization  is  based  on  a  heuristic 
driven  by  the  most  used  and  least  used  memory  allocations  with  respect  to 
input  goals. 
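
The storage utilization computation and the most-used/least-used
comparison that drives the heuristic can be sketched as follows. The
nested-list layout (indexed by memory, then block) and the function
names are assumptions made for illustration.

```python
def storage_utilization(stored, block_bits, memory_bits):
    """Fraction of each memory occupied by the blocks assigned to it.

    stored[m][b] is 1 when memory m holds block b, block_bits[m][b] is the
    length of block b when stored in memory m, memory_bits[m] is the
    length of memory m.
    """
    return [
        sum(s * bits for s, bits in zip(stored[m], block_bits[m])) / memory_bits[m]
        for m in range(len(memory_bits))
    ]


def goal_spread(utilization, goal):
    """Gap between the memory furthest below its goal and the one furthest above."""
    headroom = [g - u for u, g in zip(utilization, goal)]
    return max(headroom) - min(headroom)
```

A spread near zero indicates that every memory sits at about the same
distance from its goal; a large spread points to the pair of memories
between which a block reallocation is most likely to help.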


3.1.4  Development  Cost 

Software  development  costs  are  a  function  of  task  complexity  and 
programming  support  tools  available.  In  particular,  the  heterogeneous 
multiprocessor  system  adds  another  development  cost  concern,  i.e., 
coding  of  a  task  to  perform  on  more  than  one  processor  type.  A  common 
program  source  language  significantly  reduces  duplicated  coding  efforts. 
Thus,  the  development  cost  for  a  given  software  task,  t,  in  the  model  may 
be  stated  as: 


    D_t  =    [one-time development]
            + [resource manager development]
            + [duplicate utilization]

where

    y_{tp}  =  0                                                 for p = 1
            =  max ( \lambda_{ipt} x_{ti} )  for i = 1 to p-1    for p > 1

and where

    \lambda_{ipt}  =  1, if an identical source language is available
                      on processors i and p (i \ne p) for task t
                   =  a technology-specified constant, if different
                      languages are to be used (i \ne p)
                   =  0, if i = p.

If the code already exists, then d_{tp} = 0.



Note that the multiplicative factor for determining y_{tp} can be stated
as an equivalent series of linear constraints because of the zero-one
variable x_{tp} (task t is either assigned to processor p or it is not).
These (p-1) constraints are enumerated as follows for a given task t on
processor p (for p > 1):

    \lambda_{1pt} x_{t1} - y_{tp} \le 0

    \lambda_{2pt} x_{t2} - y_{tp} \le 0

    ...

    \lambda_{(p-1)pt} x_{t(p-1)} - y_{tp} \le 0

With this set of constraints, minimizing y_{tp} in the achievement
function ensures that y_{tp} will assume the appropriate maximum as
defined in the original definition.
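
The equivalence between the maximum and the set of linear constraints
can be checked numerically. The sketch below uses zero-based processor
indices, and the coefficient table is a hypothetical example; neither
the names nor the values come from the study.

```python
def duplicate_factor(p, assigned, language_coeff):
    """Value taken by the maximum: 0 for the first processor, else the
    largest language_coeff[i][p] * assigned[i] over earlier processors i."""
    if p == 0:
        return 0.0
    return max(language_coeff[i][p] * assigned[i] for i in range(p))


def satisfies_constraints(y, p, assigned, language_coeff):
    """True when y meets every linear constraint coeff * x - y <= 0."""
    return all(language_coeff[i][p] * assigned[i] - y <= 0 for i in range(p))
```

Any value below the maximum violates at least one constraint, so the
smallest admissible value is the maximum itself; this is why
minimizing the variable subject to the constraints recovers it.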


The goal objective for software development cost is now stated as:

    Minimize  \sum_{t=1}^{T} D_t



This is basically a problem of reducing development cost. The design
attempts to reduce development cost (Section 3.2) to less than a
user-supplied value, V, where V represents a ballpark estimate for the
total software development. The unit used may be man-years or dollars,
depending on the units established for the technology data base
(described in Section 4.2), which will be used to translate the task t
instruction mix (Section 4.2) to its one-time development cost (d_{tp})
for processor p. The common language coefficient, \lambda_{ipt}, is also
a function of the technology-processor-related data (Section 4.2) and
the language factor selected for the task.

3.1.5  Real-Time  Task  Resource  Requirements 

The  major  constraint  areas  interact  with  the  objective  priority 
evaluations  to  further  specify  acceptable  partitioning  attributes.  As  a 
minimum,  the  following  constraints  apply  to  basic  task  resource  require¬ 
ments  and  processor  accountability: 

(a)  Each task, t, must be assigned to at least one processor.
     This implies T constraints of the following form:

         \sum_{p=1}^{P} x_{tp} \ge 1  for t = 1 to T.

(b)  If more than one processor is permitted to perform the same
     task, a resource management overhead will be allocated to
     task t processors via the processor utilization objective
     of Section 3.1.2. However, to ensure that x_{tp} is properly
     coupled with e_{tp}, the following constraint must be applied:

         x_{tp} - e_{tp} / N_t \ge 0.


In  addition,  constraints  must  address  task  iteration  rate  and  task  ser¬ 
vice  times  to  ensure  that  real-time  task  timing  requirements  are  met: 

(a)  Given that task t must be executed N_t times during the
     problem time period, τ, the task iteration rate constraint
     is:

         \sum_{p=1}^{P} e_{tp} = N_t.

(b)  If overlap of task t execution is not permitted (i.e., t
     cannot be executing on more than one processor at a time),
     the following constraint applies:

         (c_{tp} + r_{tp}) e_{tp} \le minimum (\tau, N_t S_t)

     where S_t is the maximum time limit for one execution of task t.

     Note that if

         c_{tp} + r_{tp} > S_t

     then e_{tp} can be automatically assigned a zero value and
     deleted from consideration.
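
The pruning rule and the iteration rate requirement lend themselves to
a simple pre-check before any optimization is attempted. The sketch
below treats one task at a time and assumes the iteration requirement
is an equality (the executions across processors total exactly the
required count); the function names and data layout are illustrative.

```python
def usable_processors(compute, overhead, time_limit):
    """Processors on which one execution fits within the task's time limit.

    compute[p] and overhead[p] are the execution time and per-execution
    resource management time of the task on processor p. Any processor
    where their sum exceeds time_limit can have its execution count
    fixed at zero and be dropped from consideration.
    """
    return [
        p for p, (c, r) in enumerate(zip(compute, overhead))
        if c + r <= time_limit
    ]


def iteration_rate_met(executions, required):
    """True when the executions over all processors total the required count."""
    return sum(executions) == required
```

Applying the pruning first shrinks the set of execution-count
variables that the later steps have to examine.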

Task data dependencies must also be satisfied. These constraints
include:

(a)  All data blocks associated with task input must be available
     to the processor(s) that are permitted to perform the
     task. Thus, for input block I_{ti} the following holds:

         -x_{tp} + \sum_{m=1}^{M} a_{mp} s_{m I_{ti}} \ge 0

     for i = 1 to I_t, t = 1 to T, p = 1 to P.



(b)  All data blocks associated with task t's output must reside
     in memory storage m, which can be updated (changed) by any
     of task t's processor(s) p. If \bar{x}_{tp} satisfies

         x_{tp} + \bar{x}_{tp} = 1

     then for a given task output, block b = O_{to}, the following
     holds:


as 

c  ♦  +  y 

tp  tp 


Wmp  8mb  ^b  * 


     for t = 1 to T, o = 1 to O_t, p = 1 to P. h_b represents the number
     of different memories that have duplicate copies of block
     b; thus, this constraint requires all duplicate blocks to
     be updated (see next constraint set).

(c)  Any duplicate data blocks must be held to a minimum; therefore
     h_b may be thought of as a penalty to be added as an
     additional objective function with the following additional
     constraints:

         h_b \ge 1  (at least 1 block is in memory)

     and

         \sum_{m=1}^{M} s_{mb} - h_b = 0

     for b = 1 to B.



(d)  Input timing must properly account for the number of task t
     executions on processor p (e_{tp}) for each task input block,
     I_{ti}, i = 1 to I_t:

         e_{tp} - \sum_{m=1}^{M} a_{mpti} = 0

     and

         a_{mp} s_{m I_{ti}} N_t - a_{mpti} \ge 0  for m = 1 to M

     are used to ensure that I_{ti} is available on memory m.

(e)  Output timing must account for the number of task t executions
     on processor p (e_{tp}) for each task output block, O_{to},
     o = 1 to O_t:

         e_{tp} - w_{mpto} - N_t (1 - s_{m O_{to}}) \le 0

     and

         w_{mp} s_{m O_{to}} N_t - w_{mpto} \ge 0  for m = 1 to M

     are used with a corresponding achievement function that
     minimizes w_{mpto} to ensure that all duplicate blocks of
     O_{to} are updated.
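
Two of the data dependency checks above are easy to state directly:
counting the copies of each block, and confirming that a processor can
reach every input block of a task. The sketch assumes zero-based
indices, a block-placement table indexed by memory then block, and an
access table indexed by memory then processor; the function names are
illustrative.

```python
def copies_per_block(stored):
    """Number of memories holding each block (column sums of the placement table)."""
    return [sum(column) for column in zip(*stored)]


def inputs_reachable(processor, input_blocks, stored, access):
    """True when each input block sits in at least one memory the processor can read."""
    return all(
        any(access[m][processor] and stored[m][b] for m in range(len(stored)))
        for b in input_blocks
    )
```

A block whose copy count exceeds one is a duplicate that every writer
must keep updated, which is the reason duplicates are penalized.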
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3.1.6  Performance  Simulation  Feedback 

Sections 3.1.2 through 3.1.5 comprise the fundamental model
objectives and constraints that must be set in terms of a valid static
allocation of tasks. Performance bottlenecks detected by the simulation
model being developed under separate contract (No. F33615-79-C-0003)
will add constraints and/or modify coefficients. In particular, the
data transfer objective coefficients for given interfaces between a
memory and a processor may be readjusted to penalize use of certain
processors for a given task and/or memories for certain data block
allocations.

A stronger set of timing constraints may be required for dependent
software task threads. A task thread, F_k, may be defined as a group
of serially dependent tasks with the following notation:

    F_k = { f_{k1}, ..., f_{kG_k} }

where f_{kg} indexes one of the T tasks. In general, task f_{kg} must have
executed C percent before task f_{k(g+1)} can be enabled. Thus, the tasks
defined as a thread are not permitted to run simultaneously in parallel
processors. This constraint may be written for each thread k as follows:

P 

E  E  (  CtP*  <Ctp  *  rtp)  *tp  +  RtP  *tP)  S  Bininum  <T  ’  Tk> 

t£Fk  p-1  '  ’ 

for k = 1 to K, and T_k represents feedback timing for thread k. A further
assumption is that if task t is an element of a software thread, F_k, then
task t may not be an independent task or an element of another task
thread. If a task is required in more than one way, it can be defined as
a group of different tasks for partitioning purposes.

In  general,  these  threads  represent  critical  system  task  path 
flow  bottlenecks  as  determined  by  the  performance  simulation  of  a  given 
partition  allocation.  The  algorithm  introduces  new  or  revised  con¬ 
straints  until  one  of  the  following  conditions  exists: 

(a)  Satisfactory  solution  found 

(b)  Infeasible  condition  identified 

(c)  Maximum  feedback  iterations  performed. 

The  current  solution  state  is  to  be  saved  and/or  printed  for  future 
evaluation  as  requested  by  the  user  evaluator. 
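
The three stopping conditions suggest a simple control loop around the
simulation and the repartitioning step. In the sketch below, the two
callables stand in for the performance simulation and the constrained
repartitioning; their names, signatures, and return conventions are
hypothetical.

```python
def feedback_loop(partition, simulate, repartition, max_iterations):
    """Iterate until one of the three stopping conditions is reached.

    simulate(partition) returns a list of detected bottlenecks (empty when
    the partition is satisfactory). repartition(partition, bottlenecks)
    returns a revised partition, or None when the added constraints make
    the problem infeasible.
    """
    for iteration in range(1, max_iterations + 1):
        bottlenecks = simulate(partition)
        if not bottlenecks:
            return "satisfactory", partition, iteration
        revised = repartition(partition, bottlenecks)
        if revised is None:
            return "infeasible", partition, iteration
        partition = revised
    return "max_iterations", partition, max_iterations
```

The partition in force at exit is returned in every case, so that the
current solution state can be saved or printed for later evaluation.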
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3.2 


ALGORITHM  DESIGN  HIGHLIGHTS 


There are many mathematical programming techniques, including both
linear and nonlinear optimizers and heuristics. The partitioning model
requires integer solution values, which immediately classifies it as a
nonlinear global optimization problem even though the model itself
consists of linearly expressed objectives and constraints. In addition,
two of the three achievement priority functions (i.e., balance the
processor load and balance memory storage) are nonlinear in their
formulation of minimizing the sums of absolute differences. These
nonlinear goals, combined with the goal program matrix, which is sized
according to the parameters represented in Table 1, would challenge
both the sizing and timing limits of commercially available mixed
integer linear program models with a single achievement priority.

To determine the viable design alternatives, a study of goal
programming was made, including several military goal program
applications that have been implemented. Applications included weapon
system slice optimization in relation to planning force analysis and a
balanced budget allocation model for mixed project/agency funding. Both
of these applications interface goal programming models with other
analysis tools (such as simulation, input/output analysis, and
regression analysis) to provide a set of automated operational
evaluation tools. These additional tools provide a means to cross-check
and supply detailed model data values that are used to calibrate the
goal program model. The calibrated model is then used for selected
parametric studies to determine impact on solutions in terms of
parametric margins and solution sensitivities. Both of these
applications utilize modified versions of the classical textbook
multiphase goal program computer algorithms. A major drawback to these
codes is their susceptibility to numeric roundoff error propagation for
problems involving more than 50 to several hundred variables and
constraints. In addition to the numerical roundoff errors, the
multiphase codes studied do not use dynamic core memory management.
This requires the entire matrix and associated bookkeeping variables to
reside in main memory.

In lieu of funding the development of a mixed integer goal
program optimizer for larger problems, an alternative algorithm is the
sequential use of a good commercially available linear program optimizer
interfaced via a goal program driver that introduces each achievement
one at a time. This permits continuous solution problems with up to
16,000 rows to be handled, given adequate dynamic disc storage. Current
state-of-the-art integer solutions are restricted to several hundred
variables. The sequential use of a linear program optimizer is the
approach recommended for further study in addressing a subset of the
software partitioning algorithm as designed in this study. The design
has remained independent of a specific computer optimizer code.

Ignizio, James P., Goal Programming and Extensions, D. C. Heath and
Company, Lexington, Massachusetts, 1976

Lee, Sang M., Goal Programming for Decision Analysis, Auerbach
Publishers, Philadelphia, Pennsylvania, 1972

TABLE 1.  BASIC GOAL PROGRAM MATRIX SIZING

Even  with  the  sequential  mixed  integer  linear  program  technique, 
the  sizing  of  the  partitioning  problem  (given  in  Table  1)  is  prone  to 
challenge  the  best  optimizers  without  some  careful  matrix  selection  gen¬ 
eration  techniques.  There  are  two  major  areas  of  concern: 

1.  The  time  consumed  in  determination  of  an  initial  feasible 
solution 

2.  Excessive  iteration  thrashing  to  determine  "optimal"  integer 
solutions. 


The study of goal programming included a survey of heuristic techniques
that can facilitate the search for improved solutions given an initial
feasible solution. In practice, application-customized heuristic
algorithms have provided an efficient means for handling and reducing
the large solution space of alternatives to be searched.1

In  the  case  of  flight  trainer  candidate  designs,  the  designers 
have  an  implied  partition  which  can  be  used  as  the  initial  solution.  The 
partitioning  problem  then  becomes  one  of  "Does  a  better  solution  exist 
with  respect  to  load  balance,  memory  balance,  and  development  cost?"  The 
incorporation  of  an  initial  solution  step  has  been  recommended  as  an 
implementation  step  requiring  further  study  for  obtaining  an  expanded 
evaluation  capability.  The  current  algorithm  design  assumes  that  an 
initial  solution  is  supplied  and  proceeds  in  a  heuristic  manner  to  seek  a 
better  solution. 

To achieve a well-defined user evaluator interface of partitioning
input data, a customized heuristic goal program driver, and solution
summary capabilities, the Partitioning Algorithm for Software Systems
(PASS) has been designed emphasizing the four major processes denoted in
Figure 2:

1.  User input interface and processing, referenced as PASS1

2.  Basic partitioning algorithm, referenced as PASS2

3.  Augmented partitioning algorithm (PASS3) to handle dynamic
performance prediction feedback

4.  Solution summary reports (PASS4) of a given partition for
candidate design i.

Ignizio, James P., "Solving Large Scale Problems: A Venture into a New
Dimension," Pennsylvania State University, 1978


Prior  to  describing  each  of  these  steps,  the  overall  design  flow  of  the 
steps  and  their  interfaces  is  presented. 

The major external interfaces (exclusive of an optimizer) with
PASS include the evaluation user and a multiprocessor configuration
performance predictor simulator. The user interface considerations for
actual implementation are expanded in Section 4, with emphasis on
incorporating a modular, automated data repository to facilitate input
preparation of PASS1 and maintenance of current flight trainer design
parameters with respect to given partitions (PASS4). The performance
predictor interface is designed to interact with the Computational
Performance Predictor Simulator (CPPS) being specified and designed
under separate contract. The iterative process of determining a new
allocation (PASS3) based on performance prediction feedback is
performed until one of the following conditions is reached: (a) a
satisfactory partition is found, (b) a design bottleneck is identified,
or (c) the maximum number of iterations has been reached.

3.2.1  Input  Processing  Step  (PASS1) 

The  mathematical  statement  of  Section  3.1  contains  software, 
hardware,  and  combined  software/hardware  parameters.  The  design  efforts 
of  this  study  have  emphasized  the  separation  of  any  combined  parameters 
into  basic  hardware  and  software  components  with  the  aid  of  technology 
data  base  tables  and  computational  formulas  necessary  to  generate  the 
given  "combined"  parameter.  Thus,  all  task/processor  and  data/memory 
parameters  are  derived  from  independent  software  and  hardware  design 
configuration  inputs  (see  Section  4.2). 

The  specific  inputs  are  defined  in  Appendix  A.  Figure  3  deline¬ 
ates  the  major  design  process  flow  for  user  input  editing  and  computa¬ 
tional  sequences  to  properly  set  up  for  the  actual  partitioning  steps 
that  follow.  The  design  demonstration  (Appendix  C)  provides  the 
detailed  computations  to  map  the  user  input  into  the  internal  partition¬ 
ing  algorithm  control  and  lookup  tables  listed  in  Table  2.  Appendix  B 
provides  representative  report  formats  for  the  user  input  echo,  which 
consists  of  the  reports  listed  in  Table  3. 

3.2.2  Basic  Partitioning  Algorithm  (PASS2) 

This  step  provides  the  basic  controls  and  logic  for  interfacing 
with  the  three  user-ordered  heuristics  to  determine  whether  an  improved 
partitioning  solution  can  be  found.  As  mentioned  in  the  introductory 
remarks  on  design  in  Section  3.2,  the  basic  assumption  is  that  an  initial 
feasible  (with  respect  to  real-time  constraints)  partition  is  supplied. 
The resultant basic partitioning algorithm flow is denoted in Figure 4 as
being comprised of initial solution verification, heuristic control
table setup, and user-specified, priority-ordered heuristic executions.

TABLE 2.  INTERNAL PARTITIONING ALGORITHM CONTROL AND LOOK-UP
TABLES ESTABLISHED BY PASS1

    Limits, Constants, and Codes
    Current Problem Sizing Controls
    Priority Controls
    Current Processor List
    Current Memory List
    Current Communication Link List
    Current Internal Device List
    Task/Processor Allocation and Restrictions
    Memory/Processor Allocation and Restrictions
    Block/Memory Allocation, Restrictions, and Coefficients
    Master Block List
    Master Task List


TABLE 3.  USER INPUT ECHO REPORTS THAT ARE SPECIFIED IN APPENDIX B

    FORMAT*    REPORT TITLE

    1          Standard Run Identification
    2          Hardware Component Summary
    3          Data Block Summary
    4          Task Summary
    5          Baseline Load Summary
    6          Evaluation Options/Restrictions
    7          Evaluation Priorities
    8          Basic Partitioning Problem Size

    *  Format reference to Appendix B

There are three basic heuristic algorithms corresponding to the
three objectives or achievement functions: processor utilization
(LOADBL), memory utilization (MEMBAL), and development cost (RDCOST).
Figure  5  denotes  the  major  selection  branch  as  being  a  function  of  the 
user-specified  priority  execution  order  GOAL  (g),  where  g  is  the  current 
priority  level  being  executed.  Prior  to  invoking  the  appropriate  heu¬ 
ristic,  a  test  is  made  to  determine  whether  the  basic  priority  goal  level 
has  already  been  achieved.  If  so,  a  return  is  made  to  proceed  to  the 
next  priority  level. 
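
The selection logic can be sketched as a loop over the user-ordered
goals, with the achievement test performed before each heuristic is
invoked. Only the three heuristic names come from the text; the
dictionaries of test and heuristic functions are a hypothetical
interface chosen for illustration.

```python
def run_priorities(goal_order, achieved, heuristics, partition):
    """Visit each priority level in user order, skipping goals already met.

    goal_order lists heuristic names (for example "LOADBL", "MEMBAL",
    "RDCOST") from highest to lowest priority. achieved[name](partition)
    reports whether that goal is already satisfied, and
    heuristics[name](partition) returns a possibly improved partition.
    """
    log = []
    for name in goal_order:
        if achieved[name](partition):
            log.append((name, "already achieved"))
            continue
        partition = heuristics[name](partition)
        log.append((name, "heuristic run"))
    return partition, log
```

Keeping the order in a plain list means a different user priority
ordering changes only the input, not the driver.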

The major features incorporated in the design permit ranking of
the current partition solution variables with respect to impact on the
given priority under consideration. The following ranking definitions
are utilized for each of the respective heuristics:

1.  For the load balance heuristic, processor p's utilization,
U_p, is subtracted from its goal, G_p, to define U'_p = G_p - U_p.
The resultant U' array is then ranked from high to low values
(i.e., those below their goal to those above their respective
goal in order of difference magnitude). The resultant
ranked array is then used to determine whether the load is
currently in balance, i.e., (U'_1 - U'_P \le GTOLPU), with respect
to a user-supplied tolerance (GTOLPU) for processor utilization.
The object is to offload some of the tasks from the
heavily utilized processors to the lighter loaded processors
to obtain a better balance, as denoted in Figure 6.

2.  For the memory balance heuristic, the allocated memory, u_m,
is subtracted from its goal allocation, g_m, to define
u'_m = g_m - u_m. The resultant array, u', is then ranked (in a
similar fashion as processor utilization) to determine
whether the current memory allocation is in balance according
to the user-supplied goal (u'_1 - u'_M \le GTOLMU). The
objective (Figure 7) is to reallocate some of the blocks
from the over-allocated memories to the under-allocated
memories to obtain a better balance.

3.  The development cost is a minimization problem of individual
task development cost. Thus, the tasks are ranked from most
expensive to least expensive. The ranked cost array can
then be systematically processed (Figure 8) to determine
whether a more cost-effective solution is possible (i.e.,
can this task be implemented on another processor in the
candidate configuration at less development expense and
still meet real-time constraints?). It should be noted that
this priority is only applicable to a heterogeneous set of
candidate processors.
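
The ranking and balance test used by the load and memory heuristics
can be sketched in a few lines. The function names are illustrative;
the tolerance argument plays the role of GTOLPU or GTOLMU.

```python
def rank_by_headroom(utilization, goal):
    """Pairs (index, goal minus utilization), largest headroom first.

    Units below their goal appear at the front of the list and units
    above their goal (negative headroom) at the back.
    """
    headroom = [(p, g - u) for p, (u, g) in enumerate(zip(utilization, goal))]
    return sorted(headroom, key=lambda item: item[1], reverse=True)


def in_balance(ranked, tolerance):
    """True when the first and last ranked entries differ by no more than the tolerance."""
    return ranked[0][1] - ranked[-1][1] <= tolerance
```

When the test fails, the last entry identifies the unit to offload and
the first entry the most promising recipient.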


Figure 5.  Priority heuristic selection process.

Figure 6.  Processor load balance heuristic.

Figure 7.  Memory allocation balance heuristic.


For each of the heuristics, checks are incorporated to ensure
that real-time limitations are not violated by any subsequent new
"improved"  solutions  found  by  the  respective  heuristic.  Design  emphasis 
was  placed  on  the  order  for  incorporating  these  checks  within  the  heu¬ 
ristic  procedure  to  avoid  excessive  calculations  when  easily  determined 
restrictions  would  prohibit  exploring  a  given  tradeoff.  For  example, 
when  attempting  to  reallocate  a  task  to  another  processor,  only  those 
processors  that  may  perform  the  task  are  considered.  To  solve  some  of 
the  more  complex  interrelated  real-time  constraints,  a  linear  program 
statement  might  be  studied  to  determine  whether  effective  utilization  of 
an  optimizer  would  be  feasible  for  performing  the  given  tradeoff.  The 
current  algorithm  incorporates  a  specific  check  of  constraints  as  formu¬ 
lated  in  Sections  3.1.5  and  3.1.6. 
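
The ordering of checks from inexpensive to expensive can be sketched
as a filter over candidate processors for a task move. The data
layout, the parameter names, and the timing_ok callable (standing in
for the full real-time constraint check) are hypothetical.

```python
def candidate_moves(task, current, permitted, utilization, added_load, limit, timing_ok):
    """Yield processors that could take over `task`, cheapest tests first."""
    for p in permitted[task]:          # table lookup: may p perform the task at all?
        if p == current:
            continue
        if utilization[p] + added_load[p] > limit[p]:
            continue                   # simple arithmetic: absolute limit exceeded
        if timing_ok(task, p):         # costly real-time check, evaluated last
            yield p
```

Because the costly check is reached only by processors that pass the
earlier tests, an overloaded processor never triggers it.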

The  heuristic  driver  continues  at  each  priority  level  until  it 
has  exhausted  its  systematic  exchange  tradeoff  search  for  an  improved 
measurement.  The  three  priority  levels  are  executed  in  the  order  as 
specified  by  the  user  evaluation  priority  inputs  of  PASS1.  The  basic 
computational  and  logical  sequence  flows  for  each  of  the  three  priority 
levels  are  denoted  in  Figures  6,  7,  and  8,  respectively. 

3.2.3  Augmented  Partitioning  Algorithm  (PASS3) 

This  step  is  an  expansion  of  the  PASS2  processes  with  emphasis  on 
resolving  identified  performance  bottlenecks  of  the  following  types: 

1. Cycle or thread timing is not sufficient for real-time
system response.

2. Specific candidate component (i.e., processor, memory,
communication link) utilization is unacceptable.

The  basic  process  decision  flow  is  depicted  in  Figure  9. 

Recognizing that manual user evaluation insight may help expedite
the search for an improved partitioning, process PA 3100 provides the
option of manually modifying the current allocation. Once any
modifications have been processed, the performance data are processed
via PA 3200 to readjust coefficients and to set up additional constraint
generation controls. The new constraints are then constructed, and their
basic impact on the current partition is assessed in terms of solution
feasibility. During this process (PA 3300), each performance bottleneck
is processed individually, in a predetermined order of criticality.

If a cycle or thread is the bottleneck, then the respective
resource management and data communication links are examined to
determine the major bottleneck within the thread or cycle. Penalty
coefficient adjustments are made to the processor utilization equation.
An alternate partition is sought that satisfies the end-to-end time
requirement of the given cycle or thread under these more stringent
constraints.

Figure 9. Augmented partitioning algorithm flow.

If a component is above its allotted utilization, a check is made
to determine whether the bottleneck is one of processor balance or
memory balance. If it is a processor, the processor heuristic is used to
offload the offending processor. If it is a memory problem, an attempt is
made to find a faster-access memory or, if shared memory access is the
bottleneck, to add a duplicate block.
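
The decision flow of the preceding paragraphs can be summarized in a
short Python sketch. It is offered only as an illustration: the
Bottleneck record, its field names, and the convention that a lower
criticality value is processed first are assumptions made here, and the
returned strings merely label the corrective actions named in the text
rather than perform them.

```python
from dataclasses import dataclass


@dataclass
class Bottleneck:
    kind: str             # "thread" (cycle or thread), "processor", or "memory"
    name: str             # offending cycle, thread, or component
    criticality: int      # assumed convention: lower value is handled first
    shared: bool = False  # memory only: is shared access the problem?


def plan_actions(bottlenecks):
    """Return (name, action) pairs in order of criticality."""
    plan = []
    for b in sorted(bottlenecks, key=lambda b: b.criticality):
        if b.kind == "thread":
            action = "adjust penalty coefficients, seek alternate partition"
        elif b.kind == "processor":
            action = "offload with processor heuristic"
        elif b.kind == "memory":
            action = ("add duplicate block" if b.shared
                      else "find faster-access memory")
        else:
            raise ValueError(f"unknown bottleneck kind: {b.kind!r}")
        plan.append((b.name, action))
    return plan
```

Each bottleneck is handled individually, which mirrors the statement
that bottlenecks are processed one at a time in a predetermined order.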

As the processing of bottlenecks is performed, the augmented
heuristic driver invokes PASS2 partitioning modules interspersed with
additional checks for maintaining the appropriate thread and/or cycle
constraints. If a new partition is found to be acceptable, it is saved
for feedback to the performance simulation and further manual analysis.
If not, the problems are identified for user evaluation. Appendix D
contains the detailed design flows necessary to fully enumerate the
algorithm. Additional changes are anticipated as the details of the
performance simulator design are enumerated under Contract
No. F33615-79-C-0003.


3.2.4  Solution  Summary  Reports  (PASS4) 

The report generation features of PASS4 are designed to provide
printed summaries of a partition found by either PASS2 or PASS3 for a
given candidate configuration. The specific formats chosen present the
partition solution from five complementary, but different, aspects:
(a) partitioning priority level measurements, (b) task allocations,
(c) data block allocations, (d) processor allocations, and (e) memory
allocations.

Figure 10 reflects a modular design flow based on user requests
for any of the reports for a given partition j of candidate configuration
i. This report generation capability should be accessible from batch job
control, special user codes, and interactive displays, so that
alternative partition solutions for a given candidate can be
automatically recalled and/or printed with maximum evaluation
flexibility.

Specific  output  report  formats  are  presented  in  Appendix  B.  The 
design  demonstration,  Appendix  C,  has  sample  output  reports  for  user 
reference. 


Figure 10. Report generator design


3.3 FEASIBILITY DEMONSTRATION

In deriving a meaningful, yet simple, sample problem, specific
preliminary design material was obtained from Williams AFB with regard
to an ongoing expanded design for the Advanced Simulation for Pilot
Training (ASPT) multiple processor visual computational support
subsystem. The preliminary design material provided a realistic source
of the format for ongoing trainer computational design input. It also
included a mix of general-purpose and special-purpose processors. The
information in this memorandum provided a good base for generating a
sample problem; however, the configuration described had to be
simplified to obtain a flexible, yet easy-to-follow, manual
demonstration problem.

The design factors in the original problem were very restrictive
as to Central Processing Unit (CPU) task assignments and thus left very
little room for alternative partitioning. This reinforces the fact
that, in software design, tasks tend to be defined in terms of the
selected hardware configuration features to meet computational needs, as
opposed to specifying application computations and then matching tasks
to the hardware selection. For the partitioning algorithm to be
applicable to alternative allocations and partitions, the major
feasibility issue concerns the design language and the means for
inputting the problem definition from which the partitioning model is to
operate. These issues are discussed in Section 4.

For demonstration purposes, overview inputs, restrictive inputs,
and detailed inputs have been incorporated to illustrate various aspects
and paths of the partitioning process and to point out the tradeoffs in
using detailed inputs versus general estimates. The complete algorithm
feasibility demonstration is included as Appendix C to this report. It
is presented in the following order: sample problem definition, user
input sheets, user input echo summary, basic partitioning priority
calculations, sample performance feedback contingencies, and solution
summary outputs.


Figures 11 through 13 illustrate the major partitioning
components as extracted and simplified from a set of Williams AFB ASPT
preliminary design notes for the visual subsystem. The overall processor
configuration is denoted in Figure 11. The memory and external
communications are illustrated in Figure 12, which includes both private
and shared memory devices as well as processor-to-processor direct data
transfer. Figure 13 denotes the simplified task flow used for
demonstrating the input and output steps of the algorithm. The tasks of
Figure 13 may be further divided into more detailed tasks for
demonstrating and testing specific features of the partitioning
algorithm, once an automated version of the algorithm is implemented.

The sample demonstration (delineated in Appendix C) permits the
definition of potential automated implementation processes for handling
real-world partitioning problems. The examples demonstrate the
feasibility of an automated tool. Section 4 provides recommended
implementation steps for verifying and validating the partitioning tool.
These steps will require that the basic algorithm be automated so that
its performance characteristics can be properly evaluated and
demonstrated for more realistic partitioning problems, which tend to be
larger than the manual demonstration examples. The manual examples will
permit the basic logic to be verified for a controlled, small-scale
application prior to "cranking out" large-scale partitioning problems.
This will permit an initial level of confidence to be established in the
automated version.


4.  MODEL  IMPLEMENTATION  CONSIDERATIONS 


To successfully implement the software partitioning algorithm,
an up-to-date technology data base for the flight training simulator
computational devices is essential. This section delineates the data
collection process and decision steps recommended for potential
automation and quality control of the algorithm defined in Section 3. It
proceeds from an overview of the candidate design evaluation environment
to a detailed description of the evaluation support data base
repository, followed by computer selection criteria and the recommended
implementation schedule for automation of the software partitioning
evaluation algorithm.


4.1  FLIGHT  TRAINING  SIMULATOR  EVALUATION  ENVIRONMENT 


Typically, the development of flight training simulator
candidate designs for the Air Force is contracted out by the Simulation
System Program Office (ASD-SD24). The computational subsystem design
development is monitored and evaluated by the Deputy of Engineering
Simulation (ASD-EN). In some cases, the flight trainer development is
directly contracted by a specific system office (as in the case of the
F-16 trainer). Currently, the contracted organization has the primary
responsibility for establishing both hardware and software requirements
of the computational system, subject to certain Air Force guidelines and
training capability objectives. The candidate design evolves through an
iterative refinement of documentation and algorithm enumeration
analysis, which typically progresses from system specification
functional flows to the detailed enumeration of the candidate design.
Each of these levels has narrative descriptions interspersed with a
variety of technical charts, drawings, tables, flow diagrams, interface
definitions, etc. As denoted in Figure 14, however, the volume of
documentation for a training simulator quickly becomes unwieldy unless
documentation traceability and content standards are adhered to and
enforced via constructive reviews geared to detecting and correcting
errors early in the development phase.

This  effort  has  specifically  addressed  the  software  partitioning 
aspects  of  candidate  design  evaluation.  The  three  major  outputs  of  the 
partitioning  algorithm  are  measures  of  the  processing  load  balance, 
memory  utilization,  and  estimated  development  implementation  cost  based 
on  given  timing  and  sizing  input  requirements  of  the  respective  tasks  and 
data  load  for  a  given  candidate  configuration.  For  effective  use  of  the 
software  partitioning  algorithm,  the  underlying  mathematical  model  of 
Section  3.1  must  be  understood  in  terms  of  the  processor  utilization, 
memory  utilization,  and  development  cost  formulations,  which  are  the 
primary  outputs. 
Figure 14. Hierarchy of flight trainer documents, which relates to
candidate design evaluation, can quickly become unwieldy if content
and traceability standards are not adhered to or enforced.
The simulator computational subsystem interfaces with and
coordinates a large number of the trainer simulator subsystems.

To obtain reliable outputs, a consistent, systematic procedure
needs to be established with appropriate configuration management and
quality assurance provisions and controls. The major implementation
consideration for such a procedure is the establishment of a consistent
data repository for pertinent flight trainer computational design data.
No central repository for Air Force flight trainer computational designs
currently exists, although various organizations (such as ASD-EN) do
have their own evaluation data repositories.

During the course of this contract, it was learned that the Naval
Training Equipment Center (NTEC) in Orlando, Florida, does have a
repository of all documentation associated with Navy training devices,
including the computational subsystem. NTEC recently modified the
required Data Item Descriptions related to the computational subsystem
to be an integral part of training device development in conjunction with
a proposed Appendix A to the trainer specification, MIL-STD-1644,
entitled "Trainer Software Design, Control, Production Testing and
Acceptance Procedures and Requirements." This proposed specification
incorporates the top-down structured design approach with minimum
standards that are required of each milestone document and its
associated review content, error detection/correction actions, and
milestone completeness determination. The procedures are in basic
agreement with the development cycle presented in Section 2.1. This set
of documents permits a consistent repository to be established and
maintained for current reference and as analysis input for new
development considerations. Unfortunately, it is still primarily a
manual information storage and retrieval system when it comes to
accessing data pertinent to software partitioning.

The factors identified in Section 3.1 that influence optimal
software allocation (such as data block, task, processor, and memory
descriptions) remain the same regardless of the system assumptions or
presentation format. Indeed, these factors (Table 4) must typically be
extracted from more than one document to obtain the complete set of input
and constraint parameters defined in the mathematical statement of
Section 3.1. To assist in the review of documents with respect to
software partitioning of the computational subsystem, the supporting
data base parameters have been segmented into five major areas with
respect to the flight trainer simulator:

1.  Trainer  Computational  Interface  Requirements 

2.  Baseline  Application  Components 

3.  Candidate  Hardware  Configuration  Components 

4.  Technology  Data  Base 

5.  Evaluation  Criteria/Constraints  and  Partitioning  Load. 

Figure 15 reflects the interactive nature of these data base areas with
respect to technology capabilities and the development cycle up through
the completion of the design but prior to actual implementation and
testing. The upper area relates to milestone documents of the training
computational interface requirements, software design, and hardware
design, respectively. The lower half represents the technology data
base, which permits an abbreviated means for entering the design details
on which the partitioning algorithm is to operate. The left half relates
the devices to be serviced by the computational subsystem, and the right
half reflects the internal computational subsystem structure
organization and devices.


TABLE 4. DEVELOPMENT DOCUMENTS AND THEIR
RELATIONSHIP TO THE PARTITIONING
ALGORITHM FOR SOFTWARE SYSTEMS


DOCUMENT(S)                    INPUT AREA

Computational Subsystem        External Device Interfaces
Interface Specification        Required Components
                               Functional I/O Map
                               Communication Rules and Priorities
                               Baseline Load(s)

Software Design and            Data Block Descriptions
Data Base Specifications       Task Descriptions
                               Task Threads
                               Baseline Load(s) Tasking

Hardware Configuration         Processors
Design Specifications          Memories
                               Interfaces (Internal and External)
                               Communication Rules


Figure 15. Computational design evaluation must relate a specific design
in terms of current technology capabilities for both external
communications and internal computational subsystem details.

Although the data are extracted from independent sources,
interactive coordination and configuration controls are required to
ensure that accurate, up-to-date, best estimates are utilized for the
evaluation at hand. The evaluation criteria and constraint inputs
facilitate configuration controls, parametric analysis, and partitioning
flexibility with respect to prohibited and/or preassigned allocations in
addition to initial allocations. The details of this segmented data base
are now described in terms of implementation considerations.


4.2  DATA  BASE  MANAGEMENT 


Two major recommendations are being made to facilitate orderly
consolidation of the storage and retrieval for each of the five data base
areas that provide the driving source of information for the
partitioning algorithm and candidate design evaluation process. These
recommendations are as follows:

1. The addition of a standard set of candidate design
specification tables that address the software and hardware
designs as independent sets of parametric measures.

2. The establishment of a design evaluation data base
repository utilizing an interactive file management system
under the configuration control of ASD/ENETC.

This subsection supplies key factors that should be evaluated and
modified as necessary to facilitate an orderly transition to an automated
algorithm implementation as presented in Section 4.4. Proper
utilization will require a training indoctrination as to the potential
benefits to both the flight trainer developer and evaluator communities.
Before the recommended input forms are described, several master data
structures are delineated that have a direct influence on the validity
of data entries and provide the key to independent software and hardware
design characterization.

4.2.1  Master  Data  Structures 


These master structures include (a) data block characterization,
(b) memory characterization, (c) task characterization, and
(d) processor characterization.


Combinations of these structures are incorporated into the
recommended forms for each of the five data base input areas presented in
Appendix A.

4.2.1.1 Data Block Characterization - Data characteristics such
as source, volume, frequency, content, and destination are the real-time
drivers of the computational subsystem from both external device and
internal task communications, command, and control. Table 5 denotes
attributes required by the software partitioning algorithm for each data
block that is acted upon or created by the computational subsystems being
partitioned. Note that these attributes do not tie the data block to a
specific storage device. Only external system blocks are identified as
being related to a given type of peripheral interface; for example, a
cockpit control setting input buffer block has a definite source device
that must be monitored at a predetermined sample rate. On the other
hand, the data to be computed by one task and used by a sequentially
dependent task are described in terms of minimum storage device
requirements for their storage and retrieval utilization. These master
block definitions are then referenced by block identification in the
task descriptions (Section 4.2.1.3) or in evaluation allocation
restrictions (Section 4.2.3).
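
The attributes of Table 5 map naturally onto a small validated record. The
following Python sketch is one possible encoding, not a format prescribed
by this report: the DataBlock class, its field names, the reading of
"6-Character Mnemonic" as "at most six characters", and the
worst_case_bits sizing rule are all assumptions introduced for
illustration.

```python
from dataclasses import dataclass

LEVELS = {"S", "G", "L", "T"}                     # Table 5 level codes
DISCIPLINES = {"FIFO", "LIFO", "SEQ", "RAN", "ROR", "ROS", "CBUF"}


@dataclass(frozen=True)
class DataBlock:
    identifier: str        # mnemonic, assumed to be at most 6 characters
    level: str
    discipline: str
    max_records: int
    bits_per_char: int
    chars_per_word: int
    avg_words: int         # average words per record
    max_words: int         # maximum words per record
    min_words: int         # minimum words per record

    def __post_init__(self):
        if not 1 <= len(self.identifier) <= 6:
            raise ValueError("identifier must be 1 to 6 characters")
        if self.level not in LEVELS:
            raise ValueError(f"unknown level {self.level!r}")
        if self.discipline not in DISCIPLINES:
            raise ValueError(f"unknown discipline {self.discipline!r}")
        sizes = (self.max_records, self.bits_per_char, self.chars_per_word,
                 self.avg_words, self.max_words, self.min_words)
        if any(v <= 0 for v in sizes):
            raise ValueError("sizing attributes must be positive integers")
        if not self.min_words <= self.avg_words <= self.max_words:
            raise ValueError("need minimum <= average <= maximum words/record")

    def worst_case_bits(self):
        """Assumed sizing rule: every record at its maximum length."""
        return (self.max_records * self.max_words
                * self.chars_per_word * self.bits_per_char)
```

Note that nothing in the record names a storage device, which is
consistent with the statement that the attributes do not tie a data block
to specific hardware.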

4.2.1.2 Memory Characterization - A wide variety of memories
may be incorporated into a candidate design configuration for a flight
trainer. For purposes of partitioning, memories are categorized (as
denoted in Table 6) to include read-only memory (ROM), writable control
stores (WCS), main random access memory (RAM), rotating random access
memory (RRAM), and sequential memories (SM). Within each of these
categories are additional retrieval and storage characteristics for data
representations of addressable units. These representations permit the
generic data block parameters of Section 4.2.1.1 to be matched with
appropriate memory devices in the candidate configuration for which
partitioning is being performed.
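
As a hedged illustration of how the Table 6 attributes might be used in
such a match, the Python sketch below answers two questions for a single
memory device: what is the smallest configurable size that holds a block
of a given number of bits, and roughly how long does it take to read that
block? The MemoryDevice class and its field names are hypothetical, and
the read-time formula (one access time per transfer plus one cycle time
per unit) is an assumption made here; the report does not state how these
attributes are combined.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryDevice:
    identifier: str
    mem_type: str                  # 'ROM', 'RAMM', 'RRAM', 'SM', or 'WCS'
    min_bits: int                  # Size in Bits: Minimum
    max_bits: int                  # Size in Bits: Maximum
    increment_bits: int            # Size in Bits: Increments
    bits_per_unit: int             # bits per addressable unit
    read_access_ns: float          # Read Access Time
    read_cycle_ns_per_unit: float  # Read Cycle Time/Unit
    max_units_per_read: int        # Maximum Sequential Units per Single Read

    def smallest_size_for(self, required_bits):
        """Smallest configurable size >= required_bits, or None if none fits."""
        extra = max(0, required_bits - self.min_bits)
        steps = math.ceil(extra / self.increment_bits)
        size = self.min_bits + steps * self.increment_bits
        return size if size <= self.max_bits else None

    def read_time_ns(self, bits):
        """Assumed model: access time per transfer plus cycle time per unit."""
        units = math.ceil(bits / self.bits_per_unit)
        transfers = math.ceil(units / self.max_units_per_read)
        return (transfers * self.read_access_ns
                + units * self.read_cycle_ns_per_unit)
```

A partitioning tool could apply such checks to every candidate device and
discard those for which no configurable size fits, before any more
detailed timing tradeoff is attempted.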

4.2.1.3 Task Characterization - Specification of task
attributes, which are independent of the processing hardware, poses a
very challenging problem area for incorporating the traditional
hardware-dependent design customs and notations that have evolved not
only in flight training simulator design but in computational system
designs in general. At this point in software design history, several
emerging philosophies for design standards seem to be contradictory
concerning the level of specification and the documentation language
used to convey the detailed software algorithms to be implemented. At
one extreme is the use of English-like structured pseudo code, which is
favored for being easy to follow and comprehend. At the other extreme is
an emphasis on precise, unambiguous, mathematically enumerated
representations, which provide the specific computations but, if not
annotated with English descriptions, become very hard to follow except
for persons who are very familiar with the specifics of the
algorithm. Most designs are generally a mixture of these two approaches,


TABLE 5. DATA BLOCK CHARACTERIZATION


ATTRIBUTE                  VALUES               UNIT/MEANING

Identifier                 6-Character          Provides a unique identifier for
                           Mnemonic             cross-reference and labeling purposes

Level                      1 Character
                             = 'S'              System Interface
                             = 'G'              Global (used by more than one task)
                             = 'L'              Local to one task but must be saved
                             = 'T'              Temporary scratch area for a given task

Discipline                 4-Character Code     Provides basic I/O requirement for
                                                determining suitable memory device
                                                allocation
                             = 'FIFO'           Queue
                             = 'LIFO'           Stack
                             = 'SEQ'            Sequential
                             = 'RAN'            Random
                             = 'ROR'            Read-Only Random
                             = 'ROS'            Read-Only Sequential
                             = 'CBUF'           Circular Buffer

Sizing
  • Maximum Records        Positive Integer     Records
  • Bits/Character         Positive Integer     Bits
  • Characters/Word        Positive Integer     Bytes
  • Average Words/Record   Positive Integer     Words
  • Maximum Words/Record   Positive Integer     Words
  • Minimum Words/Record   Positive Integer     Words


TABLE 6. MEMORY DEVICE CHARACTERIZATION


ATTRIBUTE                      VALUES               UNIT/MEANING

Identifier                     10-Character         Provides a unique identification for
                               Mnemonic             each memory device in the technology
                                                    data base for which the following
                                                    attributes are defined

Type                           4 Characters
                                 = 'ROM'            Read-Only Memory
                                 = 'RAMM'           Random Access Main Memory
                                 = 'RRAM'           Rotating Random Access Memory
                                 = 'SM'             Sequential Memory
                                 = 'WCS'            Writable Control Store

Size in Bits
  • Minimum                    Positive Integer     Bits
  • Maximum                    Positive Integer     Bits
  • Increments                 Positive Integer     Bits

Number of Different            Positive Integer
Addressable Units

For Each Addressable Unit
  • Level                      4-Character Code
                                 = 'BIT'            Bit Addressable
                                 = '6BB'            6-Bit Byte Addressable
                                 = '8BB'            8-Bit Byte Addressable
                                 = 'WORD'           Word Addressable
  • Bits/Unit Level            Positive Integer     Exclusive of Parity or Error
                                                    Detection/Correction Bits
  • Read Access Time           Real                 Nanoseconds
  • Read Cycle Time/Unit       Real                 Nanoseconds
  • Maximum Sequential Units   Positive Integer     Same as Unit Level
    Transferred for Single
    Read
  • Write Access Time          Real                 Nanoseconds
  • Write Cycle Time/Unit      Real                 Nanoseconds
  • Maximum Sequential Units   Positive Integer     Same as Unit Level
    for Single Write Access
  • Error Detection/           6-Character Code
    Correction
                                 = 'PARITY'         Parity Bit
                                 = 'SECDED'         Single Bit Error Correction,
                                                    Double Bit Error Detection

Number of Suppliers            Positive Integer

For Each Supplier
  • Identifier                 10 Characters        Unique Identifier
  • MTBF                       Real                 Hours - Mean Time Between Failures
  • MTTR                       Real                 Hours - Mean Time to Repair
  • MSPM                       Real                 Hours - Rescheduled Preventive
                                                    Maintenance
  • MTPM                       Real                 Hours - Mean Time for Preventive
                                                    Maintenance


which  facilitates  the  overall  functional  flow,  high-level  presentation 
and  permits  a  traceability  structure  for  enumeration  of  detailed  design 
computations  and  decision  logic. 

The remaining problem area of design specification relates to
the specific notation. Certain aspects of flight trainer computational
algorithms have become well-defined, i.e., aircraft flight kinematics.
These algorithms are generally used for making benchmarks on new
candidate processors. Thus, for well-established algorithms, a master
set of simulation task benchmarks can be established for each candidate
processor being considered. New algorithms require a more fundamental
breakout of the instruction mix to ascertain timing and sizing elements.
In summary, a master set of software task attributes is presented in
Table 7. The establishment of a master instruction mix, task I/O
descriptors, and task enablement features is recommended as one of the
steps (Section 4.4) toward algorithm implementation. Related to this
master instruction mix is the development language for task code
generation. Recent trends in simulator coding have incorporated FORTRAN
code for the scientific mathematical application models, but there is
still a strong dependence on assembly-level code for expressing
real-time executive and I/O handler modules to meet the real-time timing
requirements. The selection of a task design instruction mix notation
should be coordinated with the simulation high-order language efforts
and processor instruction architectures.

One way to obtain this information would be the use of a
graphical task flow representation that includes a standard design
notation to indicate the instruction sequences, loops, and relationships
with I/O. A flow notation, such as TBE's Input/Output Relationships and
Timing Diagrams, can be automatically traversed, with the instruction
mix and I/O features being identified and reformatted for use with the
partitioning algorithm. This would require that a standard flight
trainer computational design language and flow representations be
established, thus providing a standardized way of documenting the
detailed task computational designs.

An  important  note  is  made  here  regarding  the  traditional  means 
of  expressing  task  sizing  and  timing  in  terms  of  adds,  multiplies, 
branches,  etc.  The  instruction  mix  need  not  be  at  the  machine  level. 
Instead,  it  should  reflect  a  set  of  simulation  macros,  such  as  single 
variable  linear  table  interpolation,  and  trigonometric  functions.  Each 
of  these,  in  turn,  is  characterized  for  each  candidate  processor  as  to 
timing  and  sizing.  If  the  simulation  macro  has  been  implemented  in 
firmware  or  as  part  of  a  mathematical  package,  the  sizing  is  reduced  in 
terms  of  the  main  instruction  storage  for  the  task. 
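
The arithmetic implied by this macro-level instruction mix can be shown
with a brief Python sketch. All names and numbers below are hypothetical:
estimate_task, the macro identifiers TBLINT and SINCOS, and the
per-processor figures are invented for illustration, and the simple
sum-of-products form is an assumption rather than the report's stated
formulation. A macro held in firmware or in a mathematical package is
modeled simply by giving it a small main-storage size per appearance.

```python
def estimate_task(mix, macro_table):
    """Estimate sizing and timing of one task on one candidate processor.

    mix         -- {macro: (sizing_count, avg_exec_count, worst_exec_count)}
                   (compare the instruction mix attributes of Table 7)
    macro_table -- {macro: (code_units, avg_cycles, worst_cycles)}
                   per-processor characterization of each simulation macro
    Returns (code size in units, average cycles, worst-case cycles).
    """
    size = avg = worst = 0
    for macro, (n_code, n_avg, n_worst) in mix.items():
        units, cycles_avg, cycles_worst = macro_table[macro]
        size += n_code * units           # storage: appearances in the code
        avg += n_avg * cycles_avg        # time: executions including loops
        worst += n_worst * cycles_worst
    return size, avg, worst


# Hypothetical task: table interpolation inline, trigonometry in firmware.
mix = {"TBLINT": (3, 30, 60), "SINCOS": (2, 2, 2)}
macro_table = {"TBLINT": (12, 40, 55), "SINCOS": (1, 90, 110)}
size, avg_cycles, worst_cycles = estimate_task(mix, macro_table)
```

Repeating the calculation with a different macro_table gives the same
task's estimate on another candidate processor, which is the comparison
the partitioning algorithm needs.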

4.2.1.4 Processor Characterization - Processor technology is
constantly expanding in terms of operating system and instruction set
capabilities. Table 8 lists processor attributes that pertain directly
to the software partitioning algorithm. The operating system features

TABLE 7. TASK CHARACTERIZATION


ATTRIBUTE                      VALUES                 UNIT/MEANING

Identifier                     6-Character            Provides a unique identifier for
                               Mnemonic               cross-reference and labeling purposes

Source Language                10-Character Code      Must match entry in the master source
                                                      language list maintained for current
                                                      processor technology

Instruction Mix for Each
Instruction Type:
  • Instruction Identifier     10-Character Code      Must match entry in master simulator
                                                      instruction mix identifiers
  • Sizing Count               Positive Integer       Number of times this instruction
                                                      appears in code
  • Execution Count                                   Number of instruction interactions
      Average                  Positive Integer       considering looping conditions for
      Worst Case               Positive Integer       average and worst-case logic

Data Retrieval for Each
Task Input:
  • Block Identifier           6 Characters           See Table 5
  • When                       6-Character Code
                                 = 'START'            All records read at first of task
                                                      before main processing
                                 = 'ALONG'            Records processed one at a time
  • Average Input              Positive Integer       Records
  • Minimum Input              Non-Negative Integer   Records
  • Maximum Input              Positive Integer       Records

Data Storage for Each
Task Output:
  • Block Level                1 Character            See Table 5
  • Block Identifier           6 Characters           See Table 5
  • When                       6-Character Code
                                 = 'ALONG'            Records are output via individual
                                                      processing
                                 = 'END'              Records are output just prior to
                                                      task exit
  • Average Output             Positive Integer       Records
  • Minimum Output             Non-Negative Integer   Records
  • Maximum Output             Positive Integer       Records

Enablement
  • Type                       4-Character Code
                                 = 'TIME'             Time Enabled
                                 = 'DATA'             Data Enabled
                                 = 'SLVD'             Slaved to Master Task
                                 = 'TAD'              Time and/or Data Enabled
  • Frequency 1                Real                   Iterations/Second for Time Enablement
  • Frequency 2                Real                   Iterations/Second for Data Enablement
  • Frequency 3                Real                   Iterations/Second for Slaved
                                                      Enablement


TABLE 8. PROCESSOR CHARACTERIZATION

ATTRIBUTE                                     VALUES             UNIT/MEANING

Identifier                                    10 Characters      Unique identifier for processor
                                                                 with the following attributes

Operating System
  • Multitasking
      ▲ Levels                                Integer .GE.1
      ▲ Number of Priority Levels             Integer .GE.0      These many levels are serviced
                                              .LE. Levels        in a priority fashion. The
                                                                 remaining levels are serviced in
                                                                 a circular time-shared fashion.
  • Enablements
      ▲ Maximum Time Enablement Frequency     Integer            Enablements/Second
      ▲ Resource Management per               F10.9 .GE.0        Microseconds
        Time Enablement
      ▲ Maximum Data Enablement Frequency     Integer            Enablements/Second
      ▲ Resource Management per               F10.9 .GE.0        Microseconds
        Data Enablement
      ▲ Maximum Slaved Enablement Frequency   Integer            Enablements/Second
      ▲ Resource Management per               F10.9 .GE.0        Microseconds
        Slaved Enablement

TABLE 8. PROCESSOR CHARACTERIZATION (Sheet 2 of 3)


ATTRIBUTE                                     VALUES             UNIT/MEANING

  • For Each Task Level L
      ▲ Maximum Number of Task Level L        Integer .GE.1
      ▲ Task Service Scheme for Level L       Code
          'P'                                                    Priority
          'C'                                                    Circular
          'F'                                                    First-In, First-Out
  • Level Resource Management                 F10.9 .GE.0        Microseconds

Simulation Instruction Set Measurements
for Each Benchmark Instruction I
  • Sizing Measurements
      ▲ Number of Code Memories Involved
      ▲ The Memory Type for Each Code         4-Character Code   Must agree with master memory
        Memory m (the first memory is the                        types defined in Group 4
        user task code -- any other
        memories are predefined for this
        processor)
      ▲ Length of Code in Memory m            Integer .GE.1      Number of basic units used to
                                                                 describe memory m (see Group 4)

TABLE 8. PROCESSOR CHARACTERIZATION (Sheet 3 of 3)


ATTRIBUTE                                     VALUES             UNIT/MEANING

  • Timing Measurements for Each Code                            k=1 Implies Average
    Memory m and k=1,2                                           k=2 Implies Worst Case
      ▲ Number of Scratch Data Store Waits    Integer .GE.0
      ▲ Number of Scratch Data Store Waits    Integer .GE.0
      ▲ Computational Total for All Memories  Integer .GE.0      Cycles
  • Application Development Measurements
    Using Language L of the Master
    Language List
      ▲ One Item Development Charge           Integer            Man-hours
      ▲ Charge per Application Instruction    Integer            Man-hours
        of this Type

applicable  to  software  partitioning  relate  to  multitasking  disciplines, 
limits,  and  resource  management  services.  The  instruction  set  is 
characterized  in  terms  of  the  master  simulation  instruction  set  as 
described in Section 4.2.1.3, along with attributes for user memory I/O
versus  preprogrammed  resources  plus  development  cost  estimates. 
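The multitasking rules in Table 8 can be illustrated with a short sketch. It assumes Python for illustration only, and the class name `ProcessorOS`, its field names, and the sample identifier are hypothetical. It also assumes that the priority-serviced levels are the first ones in level order, which Table 8 does not state explicitly.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProcessorOS:
    """Multitasking attributes of one processor (hypothetical layout)."""
    identifier: str        # unique identifier, at most 10 characters
    levels: int            # number of multitasking levels, .GE.1
    priority_levels: int   # .GE.0 and .LE. levels

    def __post_init__(self) -> None:
        if not 1 <= len(self.identifier) <= 10:
            raise ValueError("identifier must be 1 to 10 characters")
        if self.levels < 1:
            raise ValueError("levels must be .GE.1")
        if not 0 <= self.priority_levels <= self.levels:
            raise ValueError("priority levels must be .GE.0 and .LE. levels")

    def service_schemes(self) -> List[str]:
        """'P' for priority-serviced levels, 'C' for circular time-shared ones.

        Assumption: the priority-serviced levels come first in level order.
        """
        circular = self.levels - self.priority_levels
        return ["P"] * self.priority_levels + ["C"] * circular


cpu = ProcessorOS("CPU-A", levels=4, priority_levels=1)
print(cpu.service_schemes())
```

Rejecting an inconsistent entry at construction time keeps later utilization calculations from running on a processor description that the table rules forbid.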

4.2.2  Suggested  Input  Forms 

The forms, as designed, may be used directly by a data keying
operator to produce keypunched cards or to enter data directly onto a file
via an interactive data entry terminal. Specific physical file formats are not
specified  since  they  will  be  a  function  of  selected  computer  file  image 
capabilities  described  in  Section  4.3.  Because  of  the  volume  of  input 
sheets,  they  are  presented  in  Appendix  A  for  each  of  the  data  base  files. 

During  the  design  of  the  input  forms,  emphasis  was  placed  on 
consolidation  and  cross-reference  techniques  that  facilitate  an  organ¬ 
ized  straightforward  user  input  interface.  The  software  partitioning 
algorithm  requires  an  assortment  of  specific  data  to  fully  define 
trainer  system  interfaces  plus  computational  hardware  and  software 
design  details  that  must  be  accurate  if  a  good  partition  allocation  is  to 
be  obtained.  The  separation  of  forms  is  based  on  the  five  major  input 
areas,  and  it  is  recommended  that  these  areas  be  standardized  for  pre¬ 
senting  the  respective  interface  requirements,  software  task/data  design 
relationships,  candidate  hardware  design  configuration,  technology  capa¬ 
bilities,  and  evaluation  priorities,  including  the  candidate  initial 
design  allocation  as  a  starting  point  for  partitioning  optimization. 


4.3  TARGET  COMPUTER  AND  SOURCE  LANGUAGE  SELECTION 

The  selection  of  the  computer  system  for  the  partitioning  algo¬ 
rithm  should  consider,  as  a  minimum,  the  following  features,  which  must 
be  incorporated  to  facilitate  automatic  implementation  of  the  partition¬ 
ing  algorithm  and  its  potential  expansions: 

1.  Data  base  management  system 

2.  Structured  program  language 

3.  Modified  linear  mixed  integer  program  optimizer 

4.  Computational  speed  and  accuracy. 

Each  of  these  features  is  described  in  more  detail  in  the  follow¬ 
ing  paragraphs. 


4.3.1  Data  Base  Management  System 

The  interrelated,  yet  separate,  data  files  (described  earlier  in 
this  section)  of  the  recommended  flight  trainer  automated  repository  are 
best implemented under a standard data base management system that
permits creation, update maintenance, and configuration management of
all data and program files. It is recommended that system data file
management utilities be available to the user in several different
modes, including batch job control, interactive terminal commands, and
user program code directives to permit a flexible, yet controlled, data
access environment. Direct record access capability is an essential
feature for implementation of the software task and block description
plus the technology data base files.

The amount of data is a function of the flight training simulator
computational candidate designs to be evaluated. Table 9 provides an
abbreviated summary of sizing relationships for each record type group
contained in the respective files required for the partitioning
algorithm. The data base management should include memory management of
code and data required for execution. Internal tables utilized by the
algorithm are sized in Table 10. The algorithm code is estimated to be
10,000 lines of structured FORTRAN exclusive of potential data manager
and optimizer extensions.
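To give a feel for how the internal table sizes scale, the sketch below evaluates a subset of the Table 10 word-count formulas for one hypothetical problem size. It is an illustration only. It assumes that P, M, T, and B stand for the numbers of processors, memories, tasks, and blocks (this section does not define the symbols), and it leaves out the entries whose formulas use other symbols. The function name `table_words` and the sample sizes are hypothetical.

```python
from typing import Dict


def table_words(P: int, M: int, T: int, B: int) -> Dict[str, int]:
    """Word counts (60-bit words) for selected Table 10 entries.

    Assumption: P, M, T, B are the numbers of processors, memories,
    tasks, and blocks. Entries using other symbols are omitted.
    """
    return {
        "1 Limits, Constants, and Code": 20,
        "2 Current Problem Sizing Controls": 9,
        "3 Priority Controls": 28,
        "5 Current Memory List": 11 * M,
        "8 Task/Processor Allocation and Restrictions": 9 * T * P,
        "10 Memory/Block Allocation and Restrictions": 5 * M * B,
        "11 Master Block List": (11 + M + 2 * T) * B,
    }


# Hypothetical candidate design: 4 processors, 6 memories, 50 tasks, 80 blocks.
sizes = table_words(P=4, M=6, T=50, B=80)
total = sum(sizes.values())
for title, words in sizes.items():
    print(f"{title:50s} {words:8d}")
print(f"{'Subtotal for listed tables':50s} {total:8d}")
```

For this sample size the Master Block List dominates the subtotal, because its formula contains the product of the task and block counts.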

4.3.2  Structured  Program  Language 

Evaluation  code  (code  used  to  facilitate  manual  analysis)  is  a 
very useful tool if it can be maintained under configuration control and
permit  expansion  to  more  detailed  models  when  necessary  for  a  given 
evaluation  analysis.  Structured  source  code  facilitates  modularity  and, 
thus,  permits  model  expansion.  Several  source  languages  are  included 
here  as  candidates  for  the  partitioning  algorithm  implementation, 
including  FORTRAN  77,  JOVIAL,  and  ADA.  These  languages  were  selected 
based  on  current  DOD-approved  languages  and  language  development  activi¬ 
ties.  Pros  and  cons  for  each  are  now  presented. 

The  widespread  recognition  of  FORTRAN  for  scientific  and  mathe¬ 
matical  programming  makes  it  the  preferred  language  of  the  three  lan¬ 
guages  considered.  The  newest  ANSI  FORTRAN  77  standards  incorporate 
character  manipulation,  which  is  independent  of  machine  architecture. 
Its use of structured logic includes both true and false process
definitions without the use of extraneous "GO TO's." File manipulation
capabilities have also been expanded to include file status checks and
standardization of certain types of data storage/retrieval mechanisms
that  have  previously  required  vendor-peculiar  FORTRAN  extensions.  Some 
problems  may  be  encountered  with  new  compilers  being  released  to  meet  the 
new  FORTRAN  standards,  but  these  compilers  should  evolve  rather  quickly 
to  support  most  of  the  ANSI  77  features.  This  will  result  in  code  that 
is  more  easily  transported  from  one  machine  to  another.  This  is  an 
important  aspect,  since  the  partitioning  algorithm  does  not  require  a 
dedicated  computer  system,  and  as  such,  it  is  envisioned  as  being  a 
useful  tool  for  flight  training  simulator  developers  and  maintenance 
reconfiguration  analysts,  as  well  as  for  Air  Force  evaluators.  Each  of 
these  specialists  generally  has  his  own  in-house  computer  system 
tailored for specific analysis needs.

TABLE 10. INTERNAL ALGORITHM TABLE SIZING REQUIREMENTS

TABLE
IPT NO.   TABLE TITLE                                      WORDS (60-bit words)

 1        Limits, Constants, and Code                      20
 2        Current Problem Sizing Controls                  9
 3        Priority Controls                                28
 4        Current Processor List                           P*(13+i)
 5        Current Memory List                              11*M
 6        Current Communication Link List                  (3+3*QND)*9
 7        Current External Device List                     (4+DB)*d
 8        Task/Processor Allocation and Restrictions       9*T*P
 9        Memory/Processor Communications Allocation       (4+4e)*M*P
          and Restrictions
10        Memory/Block Allocation and Restrictions         5*M*B
11        Master Block List                                (11+M+2T)*B
12        Master Task List                                 (16+5i+6*B+e)*T
13        Scratch and Local Parameters                     To be Defined

JOVIAL is mentioned because of its recognition by the Air Force
as  a  standard  language  for  embedded  computer  systems  development.  A 
major  drawback  is  its  limited  I/O  capabilities,  which  is  a  major  factor 
with  regard  to  the  partitioning  algorithm's  large  data  base  handling 
requirements. 

ADA  is  also  mentioned  since  it  is  the  DOD  language  being 
developed  with  source  language  standardization  as  a  major  goal  to  sup¬ 
port  software  development  of  new  military  computational  subsystems.  The 
on-going  compiler  developments  are  limited  to  experimental  compilers  and 
compiler  design  efforts.  Therefore,  at  this  time  it  is  not  a  feasible 
candidate  for  actual  algorithm  development  and  testing.  It  will  be  2  to 
3  years  before  it  is  available  in  an  operational  development  setting. 
Further  implementation/expansion  should  monitor  and  consider  ADA  since 
its  features  will  permit  more  configuration  control  as  well  as  the  struc¬ 
tured  expression  of  concurrent  process  control  flows,  I/O,  and  computa¬ 
tions  with  concise  data  base  definition. 

In  conclusion,  FORTRAN  is  the  recommended  language  for  imple¬ 
mentation  of  the  partitioning  algorithm. 

4.3.3  Modified  Linear  Mixed  Integer  Program  Optimizer 

The partitioning algorithm has the potential for future interfaces
with a modified linear mixed integer program optimizer.
The current algorithm design is based on a heuristic algorithm driver
that assumes that an initial feasible partition exists with respect to
the basic real-time processing requirements of data availability, task
timing, and less than 100% processor/memory allocation. From this
initial  feasible  solution,  it  seeks  to  determine  and  make  improvements 
on  the  initial  partition  with  respect  to  three  goals:  (a)  processor  load 
balance  within  given  growth  allotments,  (b)  memory  utilization  within 
growth  tolerances,  and  (c)  minimization  of  development  costs.  Although 
heuristics  do  not  guarantee  an  optimal  solution,  it  is  anticipated  that 
the  complexity  of  priorities  and  data  constants  will  change  frequently, 
which  makes  the  finding  of  the  true  optimal  a  meaningless  exercise. 
However,  optimizers  can  be  employed  to  help  find  an  initial  feasible 
solution  and  to  find  optimal  subset  solutions  under  the  control  of  the 
heuristic  decision  tree.  In  the  case  of  the  partitioning  algorithm,  the 
initial  feasible  solution  poses  the  largest  problem  in  terms  of  sizing 
and  numeric  accuracy  techniques  that  are  required.  Table  1  summarizes 
the  optimizer  sizing  as  a  function  of  the  size  of  candidate  designs  to  be 
evaluated. 
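The idea of improving an initial feasible partition can be illustrated with a small sketch. This is not the report's algorithm: it is a simple greedy search that addresses only goal (a), processor load balance, and it ignores memory, development cost, and growth allotments. Python is used for illustration, and the function names, task names, and load figures are hypothetical.

```python
from typing import Dict, List


def processor_loads(assign: Dict[str, int], load: Dict[str, float],
                    n_proc: int) -> List[float]:
    """Sum task loads (percent of one processor) per processor index."""
    totals = [0.0] * n_proc
    for task, proc in assign.items():
        totals[proc] += load[task]
    return totals


def improve_balance(assign: Dict[str, int], load: Dict[str, float],
                    n_proc: int, limit: float = 100.0) -> Dict[str, int]:
    """Move one task at a time off the busiest processor while the peak drops."""
    assign = dict(assign)  # work on a copy of the initial feasible partition
    while True:
        totals = processor_loads(assign, load, n_proc)
        peak = max(totals)
        busiest = totals.index(peak)
        best = None  # (new_peak, task, target) of the best move found so far
        for task, proc in assign.items():
            if proc != busiest:
                continue
            for target in range(n_proc):
                if target == busiest:
                    continue
                trial = list(totals)
                trial[busiest] -= load[task]
                trial[target] += load[task]
                if trial[target] >= limit:
                    continue  # would break the "less than 100%" requirement
                new_peak = max(trial)
                if new_peak < peak and (best is None or new_peak < best[0]):
                    best = (new_peak, task, target)
        if best is None:
            return assign  # no single move lowers the peak load
        assign[best[1]] = best[2]


loads = {"A": 50, "B": 30, "C": 10}   # hypothetical task loads, percent
start = {"A": 0, "B": 0, "C": 0}      # initial partition: all on processor 0
result = improve_balance(start, loads, n_proc=2)
print(result, processor_loads(result, loads, 2))
```

Like any heuristic of this kind, the search stops at the first partition that no single move improves, which need not be the optimal one. That is the tradeoff the paragraph above accepts.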


4.3.4  Computational  Speed  and  Accuracy 

Although  the  partitioning  algorithm  is  not  as  demanding  as  real¬ 
time  simulation  or  control  codes,  it  is  important  that  it  be  able  to 
support quick-turnaround evaluation runs to expedite the given
evaluation case. The complexities of the processor utilization calculations
in terms of task computations, resource management, and I/O are iterated
with  respect  to  potential  processor  tradeoffs  for  load  balance  calcula¬ 
tions  that  involve  a  variety  of  attributes.  Since  the  basic  computations 
are  subject  to  mathematical  model  expansions  and  changes,  floating  point 
capabilities  are  recommended  to  permit  new  equations  to  be  introduced, 
as  required,  without  the  burden  of  fixed-point  scaling. 

Units have been selected to keep the orders of magnitude of related
variables within the computational limits of most scientific machines.
These  units  should  be  periodically  examined  as  technology  advancements 
are  made.  For  example,  many  current  real-time  flight  trainer  applica¬ 
tion  cycles  are  based  on  1-sec  intervals  with  subcycles  or  subframes 
measured  in  terms  of  milliseconds.  As  timing  improvements  are  made, 
these  may  take  on  smaller  increments  of  time  for  application  cycling, 
hence  the  need  for  their  periodic  reappraisal.  Another  factor  is  machine 
cycle  time,  which  is  currently  measured  in  nanoseconds;  thus,  certain 
calculations  involving  memory  I/O  must  be  accumulated  separately  to 
obtain  totals  that  can  then  be  used  to  determine  any  appreciable  I/O 
timing  for  tasks  that  handle  large  volumes  of  data  in  addition  to  compu¬ 
tational  processing.  Typically,  32-bit  floating  point  can  represent  six 
significant  digits.  Thus,  if  a  basic  unit  is  assumed  to  be  1  sec,  the 
nanosecond  effectively  is  disregarded  unless  accumulated  separately. 
However,  if  either  double  precision  (64  bit)  or  60-bit  single  precision 
is  used,  there  is  no  problem.  An  alternative  is  for  task  memory  I/O, 
resource  management,  and  individual  instruction  timing  computations  to 
be  accumulated  for  total  task  time  in  microseconds,  and  then  task  times 
may  be  added  separately  for  a  given  application  cycle  time  in  terms  of 
current  task/cycle  relationships.  Thus,  there  is  the  need  for  floating 
point,  with  a  minimum  of  32-bit  words  sufficing  for  most  operations,  and 
either  segmented  units  or  double  precision  variables  to  account  for 
application  subtask  timing  computations. 
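The precision concern can be demonstrated with a short sketch. It assumes IEEE 754 single precision as a stand-in for "32-bit floating point" (machines of the period used a variety of formats), and it uses Python's standard `struct` module to round values to 32 bits. The helper name `to_float32` is hypothetical.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest IEEE 754 32-bit value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


one_second = to_float32(1.0)
one_nanosecond = to_float32(1e-9)

# With seconds as the basic unit, a 1-ns term vanishes in 32-bit arithmetic.
lost_in_32_bit = to_float32(one_second + one_nanosecond) == one_second

# The same sum in 64-bit arithmetic keeps the nanosecond.
kept_in_64_bit = (1.0 + 1e-9) != 1.0

# Segmented units: accumulate 1000 waits of 1 ns each in microseconds
# (0.001 us per wait), still using 32-bit rounding at every step.
total_us = to_float32(0.0)
for _ in range(1000):
    total_us = to_float32(total_us + to_float32(0.001))

print(lost_in_32_bit, kept_in_64_bit, total_us)
```

Accumulating the small terms in their own unit keeps the running total close to the expected 1 microsecond, whereas adding each term directly to a total held in seconds would discard it.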

The  use  of  preemptive  priorities  rather  than  weighted  priorities 
permits  processor  loading,  memory  allocation,  and  development  costs  to 
remain  in  their  standard  units  without  any  input  scaling  and  output 
rescaling.  However,  in  each  priority  level,  numbers  for  a  given  task  or 
data  block  should  be  summed  separately  from  totals  being  used  for  total 
memory  or  total  processor  utilization  to  avoid  underflow  accumulation 
problems. 
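The difference between preemptive and weighted priorities can be sketched as a lexicographic comparison. The example assumes Python for illustration, and the `Score` fields, their ordering, and the sample figures are hypothetical. Each measure stays in its own unit, and a lower-priority measure is consulted only when all higher-priority measures tie.

```python
from typing import NamedTuple


class Score(NamedTuple):
    """Measures of one candidate partition, highest priority first."""
    load_imbalance: float   # percent of a processor
    memory_words: int       # words
    dev_cost_hours: int     # man-hours


def better(a: Score, b: Score) -> bool:
    """True if a beats b under preemptive (lexicographic) priorities."""
    return a < b  # tuples compare field by field, in priority order


p1 = Score(load_imbalance=5.0, memory_words=120_000, dev_cost_hours=900)
p2 = Score(load_imbalance=5.0, memory_words=118_000, dev_cost_hours=1500)
p3 = Score(load_imbalance=4.0, memory_words=200_000, dev_cost_hours=3000)

# p2 beats p1 on memory because load imbalance ties; cost is never consulted.
# p3 beats both on load imbalance alone, despite higher memory and cost.
best = min([p1, p2, p3])
print(better(p2, p1), better(p3, p2), best)
```

A weighted scheme would instead need scale factors to combine percent, words, and man-hours into one number, which is the input scaling and output rescaling that the paragraph above avoids.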

4.4  RECOMMENDED  IMPLEMENTATION  SCHEDULE 

The  major  tasks  and  their  hierarchical  relationships  are 
depicted  in  Figure  16.  Each  of  these  tasks  is  briefly  described  in  this 
section  with  cross-references  to  appropriate  report  sections  for  related 
details.  Although  some  parallel  task  sequences  are  depicted,  there  are 
some interdependencies, as denoted in Figure 16. These interdependencies
are basically handled at major detailed reviews, which are recommended
to be held quarterly to assess the implementation progress, to
ensure that interface definitions are adhered to, and to establish more
detailed interfaces as the appropriate operational consideration details
become known.


Figure  17  groups  the  tasks  into  four  major  implementation  phases 
over  a  2.5-year  period.  There  is  an  overlap  between  Phase  III  and  Phase 
IV,  with  the  major  emphasis  of  Phase  III  placed  on  basic  (as  currently 
designed)  algorithm  validation  and  with  Phase  IV  emphasis  on  an  expanded 
validated  model  incorporating  an  optimizer  for  selected  aspects  of  the 
partitioning  algorithm.  The  implementation  tasks  are  now  described  by 
phase.  To  make  a  complete  task  statement,  there  is  some  redundancy  with 
earlier  report  sections.  Cross-references  are  made  to  avoid  excessive 
redundancy.


4.4.1  Model  Validation  Plan  and  Selected  Computer  Interfaces 

Although  the  candidate  computer  selection  aspects  have  been  de¬ 
scribed  (Section  4.3),  the  specific  computer  implementation  must  be  fur¬ 
ther  delineated  to  obtain  a  practical  partitioning  allocation  and  eval¬ 
uation  tool  for  flight  trainer  simulator  computational  candidate  design. 
Existing  evaluation  computer  facilities  should  be  reviewed  for  current 
formats  and  data  collection  procedures  in  addition  to  the  current  com¬ 
puter  capabilities  to  contribute  basic  inputs  to  the  Phase  I  tasks,  which 
are  now  briefly  described. 

4.4.1.1 Validation Plan - The sample problems manually demonstrated
under this contract have verified the feasibility of the partitioning
algorithm design. However, they do not constitute a model calibration
case from which a confidence level of model validity may be
derived.  As  evidenced  in  the  mathematical  statement  of  the  partitioning 
problem  (Section  3.1),  there  are  many  interrelated  variables  and  factors 
that  drive  the  partitioning  process,  necessitating  some  parametric  auto¬ 
mation  techniques  to  fully  analyze  the  automated  design  validity  and 
stability  for  real-world  data.  The  validation  plan  will  permit  con¬ 
trolled  algorithm  implementation  testing  to  determine  its  validity  with 
respect  to  known  partitioning  situations  of  selected  flight  training 
simulator computational designs. By addressing evaluation partitioning
problems to be handled prior to algorithm coding, the evaluation
community  is  essentially  establishing  the  foundation  for  the  algorithm 
acceptance  test  with  respect  to  its  role  as  an  evaluation  tool. 

As  a  minimum,  the  validation  plan  should  identify  the  flight 
trainer  system(s)  to  be  used  as  the  algorithm  implementation  baseline. 
It  should  also  extrapolate  intended  sizing  of  the  algorithm  application 
in  terms  of  the  number  of  each  data  base  item  described  in  Section  4.2 
(i.e.,  number  of  tasks,  blocks,  processors,  memories,  etc.).  A  set  of 
test  cases  should  be  drafted  in  an  outline  format  as  to  specific  algo¬ 
rithm  features  to  be  incorporated  and  tested  for  both  the  basic  model  and 
the expanded model.

Figure 17. Projected time relationship of tasks.


4.4.1.2 Data Base Interface - The specific flight trainer computational
design repository format and data base management utilities
should  be  delineated  by  this  task.  This  includes  finalization  of  the 
user  interface  formats  (such  as  those  contained  in  Appendix  A)  and  the 
format  by  which  the  partitioning  algorithm  may  retrieve  its  inputs  and 
store  its  outputs  with  respect  to  the  repository  and  the  interactive 
and/or  batch  user. 

This  task  incorporates  the  data  collection,  storage,  and 
retrieval  mechanisms,  plus  quality  assurance  steps  necessary  for  algo¬ 
rithm  implementation  and  usage.  The  repository  data  management  should 
incorporate  responsible  agencies  for  each  input  area  and  make  maximum 
use  of  pre-editing  and  file  management  utilities  of  the  selected  com¬ 
puter  system.  The  results  of  this  task  should  be  compiled  in  the  form  of 
a  users'  manual  for  the  flight  trainer  design  repository  and  specifi¬ 
cally  address  the  partitioning  algorithm  interfaces.  These  interfaces 
include  the  master  design  simulation  language  instruction  set  and  guide¬ 
lines  for  processor,  memory,  task,  and  data  baseline  descriptions 
(covered  in  Section  4.2)  that  will  streamline  the  orderly  preparation  of 
inputs  and  permit  gradual  controlled  growth  into  a  fully  tested  and 
implemented  repository  system  for  multiple  evaluations. 

4.4.1.3 Computer Selection - Computer candidate selection has
been  discussed  in  Section  4.3.  This  task  ties  Phase  I  activities 
together  to  determine  the  specific  coding  standards  and  interfaces  to  be 
employed  for  algorithm  implementation  for  a  given  computer  facility. 

4.4.1.4 Optimizer Interface - This task permits the long-range
interface  goals  to  be  defined  for  potential  optimization  steps  in  the 
heuristically  driven  partitioning  algorithm.  This  is  a  major  area  for 
further  study  and,  as  such,  is  recognized  in  Section  3.3. 

4.4.2  Automated  Algorithm  Verification  and  User  Design 
Foundation 


Phase II permits the initial automation of the basic algorithm
and  delineates  additional  programs  that  will  aid  in  the  bookkeeping  and 
increase  computational  confidence  levels  of  an  expanded  partitioning 
algorithm.  Each  of  the  tasks  is  now  defined. 

4.4.2.1 Establish Model Validation Procedures - This task
expands and enumerates the test cases outlined in the test plan of Phase
I. The nature of the basic partitioning algorithm is to seek and, if
possible, find an improved partition of tasks. Thus, the test procedures
must include the means for reconfiguring the subject flight trainer for
which a supposedly "better" partition has been found. In addition,
related performance measurements of the newly partitioned configuration
must be specified as to what and how they are to be collected and
evaluated to assess the predicted partition improvements of the
partitioning algorithm. To assist in this step, the multiple processor
simulator being designed under separate contract may be used to provide a
quick  look  at  the  dynamic  aspects  of  the  new  partition  prior  to  making  a 
reconfiguration  decision.  All  of  these  considerations  must  be  placed 
into  a  timeline  for  algorithm  validation  testing  to  account  for  permis¬ 
sible  reconfiguration  in  the  partitioning  restriction.  For  example,  if 
special-purpose  tasks  may  only  reside  on  special-purpose  processors, 
they  should  be  declared  as  such  in  the  partitioning  algorithm  evaluation 
options.  Thus,  realistic,  measurable  validation  test  procedures  are  the 
goal  of  this  task. 
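A declared placement restriction of the kind described above could be checked as in the following sketch. Python is used for illustration only, and the function name `violations` and the task and processor names are hypothetical. The report does not prescribe this representation.

```python
from typing import Dict, List, Set, Tuple


def violations(assign: Dict[str, str],
               allowed: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """List (task, processor) pairs that break a declared restriction.

    Tasks that do not appear in `allowed` are treated as unrestricted.
    """
    return [(task, proc) for task, proc in sorted(assign.items())
            if task in allowed and proc not in allowed[task]]


# Hypothetical restriction: the visual task may reside only on GFX1.
allowed = {"VISUAL": {"GFX1"}}
candidate = {"VISUAL": "CPU1", "NAV": "CPU1"}
print(violations(candidate, allowed))

candidate["VISUAL"] = "GFX1"
print(violations(candidate, allowed))
```

Running such a check on every proposed partition before reconfiguration keeps the validation test from measuring an arrangement that the trainer cannot actually adopt.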

4.4.2.2 Design Repository Programs - The users' manual of Phase
I will undoubtedly require specific repository storage/retrieval programs
to be designed to augment the system-supplied data base capabilities
to support the flight trainer evaluators' "input jargon" and to
efficiently  handle  the  input  and  subsequent  updates  to  each  of  the  vari¬ 
ous  files  to  ensure  consistency  and  completeness  of  any  given  repository 
transaction.  The  results  of  this  task  constitute  the  detailed  design  of 
each  and  all  repository  programs  to  be  implemented  in  Phase  III. 

4.4.2.3 Code/Verify Basic Algorithm - This task is the most
straightforward  of  all  of  the  tasks  and  simply  entails  the  coding,  debug¬ 
ging,  and  verifying  of  the  basic  algorithm  as  designed  and  demonstrated 
as  part  of  this  subject  contract.  This  provides  the  working  baseline  for 
all  future  expansion  in  both  model  repository  and  optimizer  interfaces. 
The  results  of  this  task  provide  a  source  code  listing,  verification  test 
case  execution  outputs,  and  documented  interpretation. 

4.4.2.4 Design Optimizer Programs - The emphasis of this task is
to  be  placed  on  upgrading  and  complementing  an  existing  mathematical 
optimizer  package  selected  in  Phase  I  with  respect  to  computational  and 
logic  needs  peculiar  to  the  partitioning  application.  This  task 
requires  extensive  knowledge  and  experience  with  mathematical  optimiza¬ 
tion  codes  and  their  numerical  stability  in  terms  of  accuracy,  scaling, 
iteration,  and  masking  techniques  that  can  judiciously  expedite  the 
solution  space  search  for  initial  feasible  solutions.  The  task  also 
requires knowledge of and experience with optimal subproblem solutions as
called  by  the  heuristic  driver  of  the  basic  algorithm.  The  results  of 
this  task  will  comprise  the  detailed  design  of  programs  to  be  implemented 
to  support  the  optimizer  interface. 

4.4.3  Basic  Model  Validation  and  Expanded  Program  Interface 
Development 

This  critical  phase  permits  the  large-scale,  real-world  data 
assessment  of  the  basic  algorithm  to  be  made.  The  first  part  of  Phase 
III  is  associated  with  specific  data  collection,  scripting,  and  support 
program  coding.  The  latter  part  of  this  phase  incorporates  efforts  of 
the  first  part  for  basic  algorithm  validation  testing.  In  addition,  the 
optimizer  programs  are  developed  in  preparation  for  the  Phase  IV 
expanded model. Each of the Phase III tasks is now described.

4.4.3.1 Script Validation Data - Validation input data must be
collected and prepared utilizing the validation input procedures for
each test case for basic algorithm and expanded algorithm validation
cases. A test case cannot proceed until its basic inputs have been
properly prepared.

4.4.3.2 Develop Repository Programs - The programs designed in
task  2.2  of  Phase  II  are  coded,  debugged,  and  verified  by  means  of 
validation  input  procedures  to  assist  in  the  input  processing  of 
task  3.1. 


4.4.3.3 Develop Optimizer Programs - This task codes and debugs
the  programs  designed  in  task  2.4  of  Phase  II  in  preparation  for  expanded 
algorithm  verification  and  validation  of  Phase  IV. 

4.4.3.4 Validate Basic Algorithm - Each validation test case is
made  in  the  order  prescribed  in  the  test  procedures.  If  any  problems  are 
encountered,  their  impact  on  the  test  plan  and  case  procedures  must  be 
fully  evaluated  to  determine  what  action,  if  any,  is  necessary  to  con¬ 
tinue  the  test  program.  All  test  execution  reports  should  be  included  as 
appendices  to  the  test  summary  report.  It  is  anticipated  that  certain 
validation  tests  can  be  run  prior  to  complete  implementation  of  the 
repository  to  exercise  the  fundamental  paths  of  the  algorithm. 

4.4.4  Expanded  Model  Verification,  Validation,  and  Formal 
Acceptance  Testing 

Phase  IV  paves  the  final  path  to  the  realization  of  the  parti¬ 
tioning  algorithm  as  part  of  the  standard  flight  trainer  simulator  comp¬ 
utational  design  evaluation  and/or  design  guide  tool.  The  full  reposi¬ 
tory  and  added  optimizer  capabilities  developed  in  the  first  three 
phases  are  now  integrated  and  tested  to  provide  a  controlled  user  inter¬ 
face  for  multiple  evaluation  situations.  The  tasks  are  now  defined. 

4.4.4.1 Verify Expanded Model - This task consists of selected
basic  algorithm  test  cases  to  verify  that  these  cases  are  still  properly 
handled  in  the  expanded  model.  In  addition,  new  path  verification  tests 
are  incorporated  by  the  designer  to  verify  that  new  capabilities  are 
working  as  designed. 

4.4.4.2 Validate Expanded Model - This task performs the extensive
testing as defined in the validation procedures for the expanded
model.  As  with  basic  algorithm  validation,  if  any  problems  are 
encountered,  their  impact  on  the  test  program  must  be  evaluated  and  it 
must  be  determined  whether  any  action  is  necessary  for  continuance  of  the 
test  program.  All  execution  results  should  be  included  as  appendices  to 
the test summary documentation.

4.4.4.3 Formal Acceptance Test - The complexity of the partitioning
algorithm and its potential evaluation decision-making impact
necessitate formal Government acceptance tests. These
tests  should  be  scripted  and  performed  by  an  independent  organization  to 
fully  assess  the  delivered  capability  with  respect  to  completeness  of 
documentation,  configuration,  quality,  and  purpose.  The  major  developer 
is  involved  as  a  consultant  to  explain  or  expand  documents  and  to  respond 
to  any  questions  concerning  the  delivered  operational  package.  It  is 
anticipated  that  Government  flight  trainer  system  evaluators  will  be 
responsible  for  scripting  and  conducting  these  independent  test  proce¬ 
dures  since  the  test  will  serve  as  a  training  task  that  emphasizes  the 
intended  operational  user  environment  of  the  algorithm. 

4.4.4.4 Final Report - The emphasis of this task is to be placed
on  finalizing  documentation  of  the  automated  algorithm  capabilities, 
findings,  and  conclusions.  This  documentation  should  be  accompanied 
with  the  final  user,  test,  and  program  maintenance  documentation  for 
specific program implementation details.


5. CONCLUDING REMARKS


Software  partitioning  is  a  complex,  design  development/ 
evaluation,  decision-making  process  with  many  tradeoffs  to  be  analyzed 
for  selecting  a  good  candidate  flight  training  simulator  computational 
design  for  a  particular  operational  trainer  implementation  or  upgrade. 
This  section  briefly  summarizes  the  details  presented  in  Sections  2 
through  4  in  terms  of  the  study  findings,  related  work,  and  areas  for 
further  study. 


5.1 FINDINGS


Candidate  software  designs  expressed  independently  of  candidate 
hardware  are  the  basic  key  design  feature  that  permits  software  parti¬ 
tioning  flexibility.  This  is  not  the  traditional  design  approach  cur¬ 
rently  in  use  for  system  design.  This  project  has  defined  the  types  of 
design  data  that  will  permit  independent  assessment  of  baseline  software 
tasks  for  alternative  multiple-processor  configurations.  The  key  data 
areas  are  the  establishment  of  a  standard  design  language  and  an  auto¬ 
mated  repository  for  the  given  application  design  data. 

The  partitioning  algorithm  has  been  designed  as  a  general  parti¬ 
tioning  algorithm  for  software  systems,  and  it  is  the  data  collection 
process  (Section  4.2)  that  will  make  this  algorithm  unique  for  a  given 
application  implementation.  In  this  way,  it  is  seen  as  a  useful  tool  for 
the  evaluation  of  a  wide  variety  of  computational  subsystem  designs 
since  it  is  not  constrained  to  current  configuration,  technology,  or 
application. 


5.2  RELATED  WORK 


The  results  of  this  effort  are  closely  coordinated  with  Contract 
No.  F33615-79-C-0003  for  the  AFHRL  Advanced  Multiple  Processor  Configu¬ 
ration  Study.  The  multiple-processor  study  is  concerned  with  features 
and  techniques  for  assessing  the  predicted  performance  of  given
alternative  candidate  designs.  The  partitioning  algorithm,  by  contrast,
examines  task/data  allocation  from  a  static  analysis  point  of  view  to  ensure
that  real-time  computational  requirements  are  met  with  a  balanced  load.  The  number
of  entities  that  must  be  considered  requires  that  parametric  analysis  in 
terms  of  average  or  worst-case  numbers  be  used  in  the  partitioning 
process.  The  dynamic  environment  of  the  flight  trainer  computational 
task  allocation  requires  the  addition  of  network,  queuing,  and  simula¬ 
tion  (batch  mode)  tools  to  predict  and  assess  the  performance  of  a  given 
allocation  partition  with  respect  to  representative  scenario  loads  and 
resource  management  rules.  The  multiple-processor  configuration  con¬ 
tract  is  incorporating  and  expanding  the  conceptual  repository  to 
include  the  dynamic  performance  design  aspects  that  are  pertinent  to 
alternative  computational  candidate  design  evaluations  for  operational
flight  trainers.  The  results  of  this  related  effort  are  to  be  published 
in  the  final  report  scheduled  to  be  distributed  on  or  about  31  Oct  80. 
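
The static view of task allocation described in this section (worst-case task times assigned to processors so that a real-time budget is met with a balanced load) can be illustrated with a minimal sketch. The sketch uses a generic greedy heuristic (longest task first, placed on the least-loaded processor); it is not the partitioning algorithm of Section 3. The task names, worst-case times, processor count, and frame budget are all hypothetical values chosen for illustration.

```python
import heapq


def partition_tasks(worst_case_ms, num_processors, frame_budget_ms):
    """Greedy static partition: longest task first, onto the least-loaded processor.

    worst_case_ms   -- dict mapping task name to worst-case execution time (ms)
    num_processors  -- number of identical processors (an assumption of this sketch)
    frame_budget_ms -- real-time budget each processor must stay within (ms)
    """
    # Min-heap of (current load, processor index); least-loaded processor pops first.
    heap = [(0.0, p) for p in range(num_processors)]
    heapq.heapify(heap)
    assignment = {p: [] for p in range(num_processors)}

    # Sort by descending cost; break ties by name so the result is deterministic.
    ordered = sorted(worst_case_ms.items(), key=lambda kv: (-kv[1], kv[0]))
    for task, cost in ordered:
        load, p = heapq.heappop(heap)
        assignment[p].append(task)
        heapq.heappush(heap, (load + cost, p))

    loads = {p: sum(worst_case_ms[t] for t in tasks)
             for p, tasks in assignment.items()}
    # The partition is statically feasible if no processor exceeds the budget.
    feasible = max(loads.values()) <= frame_budget_ms
    return assignment, loads, feasible


# Hypothetical task set and budget, for illustration only.
tasks = {"aero": 12.0, "engines": 8.0, "nav": 6.0,
         "visual_io": 10.0, "instructor": 4.0}
assignment, loads, feasible = partition_tasks(tasks, 2, 33.3)
print(assignment, loads, feasible)
```

A check of this kind addresses only the static question. As noted above, the dynamic behavior of a given partition under representative scenario loads still requires network, queuing, and simulation tools.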


5.3  AREAS  FOR  FURTHER  STUDY 


Advancements  in  systems  development  and  training  features  are 
sources  of  continuous  change  for  flight  trainer  systems.  A  "good"  system 
today  may  be  obsolete  in  5  years  or  less  if  it  does  not  possess  modular 
design  capabilities.  This  is  particularly  true  of  the  computational 
system,  which  must  act  as  a  coordinator,  interface,  and  decision-maker 
to  assist  the  human  operators  and  commanders  to  better  perform  their 
jobs.  As  new/upgraded  flight  trainer  systems  are  required,  the  basic
design  models  plus  new/modified  modules  may  very  likely  require  reallocation
of  new  processor,  communication,  and  memory  technologies.  Two
major  areas  of  study  have  been  isolated  as  the  key  to  potential  reali¬ 
zation  of  a  truly  automated  software  partitioning  algorithm: 

1.  The  employment  and  expansion  of  mixed-integer  mathematical
programming  (optimizer)  techniques  for  large-scale  partitioning
with  multiple  objectives

2.  The  development  of  a  master  flight  training  simulator  compu¬ 
tational  subsystem  design  repository. 

These  two  areas  have  been  incorporated  as  major  tasks  associated  with 
automation  of  the  partitioning  algorithm  described  in  Section  4.4. 

In  conclusion,  automated  software  partitioning  is  feasible.  It 
will  require  further  study,  design,  and  test  steps  that  are  directly  re¬ 
lated  to  computer  facility  selection  for  its  implementation.  The  major 
training  simulator  candidate  design  impact  would  be  toward  standardiza¬ 
tion  and  separation  of  the  software  design  representation  and  data  from 
processor  hardware  configuration  representations  and  data.  The  results 
of  the  standardization  would  permit  a  consistent  flight  trainer  computa¬ 
tional  design  automated  repository  to  be  established  and  used  in  both  new 
design  and  current  design  evaluation  tradeoffs  in  the  areas  of  software 
partitioning  and  predicted  performance  of  multiple-processor  configu¬ 
rations.  The  use  of  an  optimizer  will  permit  certain  tradeoffs  to  be 
automatically  made  and  determined  in  a  more  straightforward  manner,
permitting  more  time  for  manual  evaluation  comparisons  and  decisions.